System, method and article of manufacture for a simulator plug-in for co-simulation purposes

ABSTRACT

A system, method and article of manufacture are provided for equipping a simulator with plug-ins. In general, a first simulator written in a first programming language is executed for generating a first model and a second simulator written in a second programming language is executed to generate a second model so that a co-simulation may be performed utilizing the first model and the second model. The first simulator interfaces with the second simulator via a plug-in.

FIELD OF THE INVENTION

[0001] The present invention relates to programmable hardware architectures and more particularly to programming field programmable gate arrays (FPGA's).

BACKGROUND OF THE INVENTION

[0002] It is well known that software-controlled machines provide great flexibility in that they can be adapted to many different desired purposes by the use of suitable software. As well as being used in the familiar general purpose computers, software-controlled processors are now used in many products such as cars, telephones and other domestic products, where they are known as embedded systems.

[0003] However, for a given function, a software-controlled processor is usually slower than hardware dedicated to that function. A way of overcoming this problem is to use a special software-controlled processor such as a RISC processor which can be made to function more quickly for limited purposes by having its parameters (for instance size, instruction set etc.) tailored to the desired functionality.

[0004] Where hardware is used, though, although it increases the speed of operation, it lacks flexibility and, for instance, although it may be suitable for the task for which it was designed it may not be suitable for a modified version of that task which is desired later. It is now possible to form the hardware on reconfigurable logic circuits, such as Field Programmable Gate Arrays (FPGA's) which are logic circuits which can be repeatedly reconfigured in different ways. Thus they provide the speed advantages of dedicated hardware, with some degree of flexibility for later updating or multiple functionality.

[0005] In general, though, it can be seen that designers face a problem in finding the right balance between speed and generality. They can build versatile chips which will be software controlled and thus perform many different functions relatively slowly, or they can devise application-specific chips that do only a limited set of tasks but do them much more quickly.

SUMMARY OF THE INVENTION

[0006] A system, method and article of manufacture are provided for equipping a simulator with plug-ins. In general, a first simulator written in a first programming language is executed for generating a first model and a second simulator written in a second programming language is executed to generate a second model so that a co-simulation may be performed utilizing the first model and the second model. The first simulator interfaces with the second simulator via a plug-in.

[0007] In one aspect of the present invention, the accuracy and speed of the co-simulation may be user-specified. In another aspect, the first simulator may be cycle-based and the second simulator may be event-based.

[0008] In a further aspect, the co-simulation may include interleaved scheduling. In an additional aspect of the present invention, the co-simulation may include fully propagated scheduling. In a further aspect, the simulations may be executed utilizing a plurality of processors.

[0009] In yet another aspect, the first simulator may be executed ahead of or behind the second simulator. In yet an additional aspect, the first simulator is coupled to the second simulator via a network.
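By way of illustration only, the arrangement summarized above, in which a cycle-based simulator drives an event-based simulator through a plug-in, might be sketched as follows. Every class and function name here (CycleSimulator, EventSimulator, CoSimPlugin, cosimulate), the fixed clock period, and the lock-step alternation are assumptions introduced for this sketch and are not taken from the specification. Both simulators are written in Python for brevity, whereas the summary contemplates two different programming languages.

```python
import heapq


class EventSimulator:
    """Hypothetical event-based simulator: consumes timestamped events in time order."""

    def __init__(self):
        self.queue = []   # min-heap of (time, value) events
        self.now = 0      # current simulation time
        self.log = []     # events processed so far, in order

    def post(self, time, value):
        heapq.heappush(self.queue, (time, value))

    def run_until(self, time):
        # Process every pending event whose timestamp is not later than `time`.
        while self.queue and self.queue[0][0] <= time:
            self.log.append(heapq.heappop(self.queue))
        self.now = time


class CoSimPlugin:
    """Hypothetical plug-in: translates clock cycles into events for the other simulator."""

    def __init__(self, event_sim, period):
        self.event_sim = event_sim
        self.period = period  # assumed time units per clock cycle

    def on_cycle(self, cycle, value):
        t = cycle * self.period
        self.event_sim.post(t, value)
        # Let the event simulator catch up before the next cycle (lock-step alternation).
        self.event_sim.run_until(t)


class CycleSimulator:
    """Hypothetical cycle-based simulator: a counter that advances once per clock cycle."""

    def __init__(self):
        self.cycle = 0
        self.counter = 0
        self.plugins = []

    def attach(self, plugin):
        self.plugins.append(plugin)

    def step(self):
        self.cycle += 1
        self.counter += 1
        for plugin in self.plugins:
            plugin.on_cycle(self.cycle, self.counter)


def cosimulate(n_cycles, period=10):
    """Run both simulators together and return the event simulator's log."""
    event_sim = EventSimulator()
    cycle_sim = CycleSimulator()
    cycle_sim.attach(CoSimPlugin(event_sim, period))
    for _ in range(n_cycles):
        cycle_sim.step()
    return event_sim.log
```

In this sketch the cycle-based simulator knows nothing about the event-based one; only the plug-in holds a reference to it, which is one plausible reading of the two simulators interfacing via a plug-in.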

BRIEF DESCRIPTION OF THE DRAWINGS

[0010] The invention may be better understood when consideration is given to the following detailed description thereof. Such description makes reference to the annexed drawings wherein:

[0011] FIG. 1 is a schematic diagram of a hardware implementation of one embodiment of the present invention;

[0012] FIG. 2 illustrates a design flow overview, in accordance with one embodiment of the present invention;

[0013] FIG. 3 illustrates the Handel-C development environment, in accordance with one embodiment of the present invention;

[0014] FIG. 4 illustrates a graphical user interface shown if one starts the program with an empty workspace;

[0015] FIG. 5 illustrates a graphical user interface used to create a project, in accordance with one embodiment of the present invention;

[0016] FIG. 6 illustrates the various types of new projects, in accordance with one embodiment of the present invention;

[0017] FIG. 7 illustrates a breakpoint, in accordance with one embodiment of the present invention;

[0018] FIG. 8 illustrates a project settings interface, in accordance with one embodiment of the present invention;

[0019] FIGS. 9A, 9B, and 9C illustrate available settings;

[0020] FIG. 10 illustrates a configurations graphical user interface, in accordance with one embodiment of the present invention;

[0021] FIG. 11 illustrates a file view interface, in accordance with one embodiment of the present invention;

[0022] FIG. 12 illustrates file properties, in accordance with one embodiment of the present invention;

[0023] FIG. 13 illustrates a workspace interface and the associated icons, in accordance with one embodiment of the present invention;

[0024] FIG. 14 illustrates a version test interface, in accordance with one embodiment of the present invention;

[0025] FIG. 15 illustrates a browse and associated results interface, in accordance with one embodiment of the present invention;

[0026] FIGS. 16A and 16B illustrate browsing commands, in accordance with one embodiment of the present invention;

[0027] FIG. 17 is a table of editing commands, in accordance with one embodiment of the present invention;

[0028] FIG. 18 is a table of regular expressions, in accordance with one embodiment of the present invention;

[0029] FIG. 19 is a table of various project files, in accordance with one embodiment of the present invention;

[0030] FIG. 20 illustrates a GUI for customizing the interface, in accordance with one embodiment of the present invention;

[0031] FIG. 20A illustrates a method for compiling a computer program for programming a hardware device;

[0032] FIG. 21 illustrates a build interface, in accordance with one embodiment of the present invention;

[0033] FIG. 22 illustrates a table showing a build menu, in accordance with one embodiment of the present invention;

[0034] FIG. 22A illustrates a method for debugging a computer program, in accordance with one embodiment of the present invention;

[0035] FIGS. 23A and 23B illustrate the various commands associated with the debug menu, in accordance with one embodiment of the present invention;

[0036] FIG. 24 illustrates a table showing the various windows associated with the debugger interface, in accordance with one embodiment of the present invention;

[0037] FIG. 25 illustrates a variables window interface, in accordance with one embodiment of the present invention;

[0038] FIG. 26 illustrates the current positioning function blib, and the related call stack window;

[0039] FIG. 27 illustrates a threads window interface, in accordance with one embodiment of the present invention;

[0040] FIG. 28 illustrates a variables window interface, in accordance with one embodiment of the present invention;

[0041] FIG. 29 illustrates a breakpoints window interface, in accordance with one embodiment of the present invention;

[0042] FIGS. 30 and 31 illustrate a table showing various differences between Handel-C and the conventional C programming language, in accordance with one embodiment of the present invention;

[0043] FIG. 32 illustrates a table of types, type operators and objects, in accordance with one embodiment of the present invention;

[0044] FIG. 33 illustrates a table of statements, in accordance with one embodiment of the present invention;

[0045] FIG. 34 illustrates a table of expressions, in accordance with one embodiment of the present invention;

[0046] FIG. 35 illustrates a net list reader settings display, in accordance with one embodiment of the present invention;

[0047] FIGS. 36 and 37 illustrate a tool settings display, in accordance with one embodiment of the present invention;

[0048] FIG. 38 illustrates the wires that would be produced when specifying floating wire names, in accordance with one embodiment of the present invention;

[0049] FIG. 39 illustrates an interface between Handel-C and VHDL for simulation, in accordance with one embodiment of the present invention;

[0050] FIGS. 40A and 40B illustrate a table of possible specifications, in accordance with one embodiment of the present invention;

[0051] FIG. 41 illustrates the use of various VHDL files, in accordance with one embodiment of the present invention;

[0052] FIG. 41A illustrates a method for equipping a simulator with plug-ins;

[0053] FIGS. 42A and 42B illustrate various function calls and the various uses thereof, in accordance with one embodiment of the present invention;

[0054] FIG. 43 illustrates a plurality of possible values and meanings associated with libraries of the present invention;

[0055] FIG. 44 shows how the synchronization works when single-stepping the two projects in simulation;

[0056] FIG. 44A illustrates a pair of simulators, in accordance with one embodiment of the present invention;

[0057] FIG. 44B illustrates a cosimulation arrangement including processes and DLLs;

[0058] FIG. 44C illustrates an example of a simulator reengagement, in accordance with one embodiment of the present invention;

[0059] FIG. 44D illustrates a schematic of exemplary cosimulation architecture;

[0060] FIGS. 45A and summarize the options available on the compiler;

[0061] FIGS. 46A and 46B illustrate various commands and debugs, in accordance with one embodiment of the present invention;

[0062] FIGS. 47A through 47C illustrate various icons that may be utilized, in accordance with one embodiment of the present invention;

[0063] FIG. 48 illustrates the various raw file bit numbers and the corresponding color bits;

[0064] FIG. 49 illustrates the manner in which branches that complete early are forced to wait for the slowest branch before continuing;

[0065] FIG. 50 illustrates the link between parallel branches, in accordance with one embodiment of the present invention;

[0066] FIG. 51 illustrates the scope of variables, in accordance with one embodiment of the present invention;

[0067] FIGS. 52, 53 and 54 illustrate a table of operators, statements, and macros respectively, along with alternate meanings thereof;

[0068] FIG. 55 illustrates a compiler, in accordance with one embodiment of the present invention;

[0069] FIG. 56 illustrates the various specifications for the interfaces of the present invention;

[0070] FIG. 57 illustrates a table showing the ROM entries, in accordance with one embodiment of the present invention;

[0071] FIG. 57A illustrates a method for using a dynamic object in a programming language;

[0072] FIG. 57A-1 illustrates a method for using extensions to execute commands in parallel;

[0073] FIG. 57A-2 illustrates a method for parameterized expressions, in accordance with various embodiments of the present invention;

[0074] FIGS. 58A and 58B illustrate a summary of statement timings, in accordance with one embodiment of the present invention;

[0075] FIG. 59 illustrates various I/O based on clock cycles, in accordance with one embodiment of the present invention;

[0076] FIG. 60 illustrates a table showing the various locations, in accordance with one embodiment of the present invention;

[0077] FIG. 61 illustrates the various family names, in accordance with one embodiment of the present invention;

[0078] FIG. 62 illustrates a timing diagram showing a signal, in accordance with one embodiment of the present invention;

[0079] FIG. 63 illustrates a timing diagram showing a SSRAM read and write, in accordance with one embodiment of the present invention;

[0080] FIG. 64 illustrates a timing diagram showing a SSRAM read cycle using generated RAMCLK, in accordance with one embodiment of the present invention;

[0081] FIG. 65 illustrates a timing diagram showing read-cycle from a flow-through SSRAM within a Handel-C design, in accordance with one embodiment of the present invention;

[0082] FIG. 66 illustrates a timing diagram showing complete write cycle, in accordance with one embodiment of the present invention;

[0083] FIG. 67 illustrates a timing diagram showing complete read cycle, in accordance with one embodiment of the present invention;

[0084] FIG. 68 illustrates a timing diagram showing complete cycle, in accordance with one embodiment of the present invention;

[0085] FIG. 69 illustrates a timing diagram showing a cycle for a write to external RAM, in accordance with one embodiment of the present invention;

[0086] FIG. 70 illustrates a timing diagram showing a cycle for a read from external RAM, in accordance with one embodiment of the present invention;

[0087] FIG. 71 illustrates a timing diagram showing a cycle for a write to external RAM, in accordance with one embodiment of the present invention;

[0088] FIG. 72 illustrates a timing diagram showing a cycle for a read from external RAM, in accordance with one embodiment of the present invention;

[0089] FIG. 73 illustrates a timing diagram showing a cycle for a write to external RAM, in accordance with one embodiment of the present invention;

[0090] FIG. 74 illustrates a timing diagram showing a cycle for a read from external RAM, in accordance with one embodiment of the present invention;

[0091] FIG. 75 is a table of pre-defined interface sorts, in accordance with one embodiment of the present invention;

[0092] FIG. 76 illustrates a timing diagram, in accordance with one embodiment of the present invention;

[0093] FIG. 76A is a flowchart showing a method for providing a versatile interface;

[0094] FIG. 77 illustrates the manner in which an interface is specified, in accordance with one embodiment of the present invention;

[0095] FIGS. 78A through 78C illustrate a table showing the specification of various keywords, in accordance with one embodiment of the present invention;

[0096] FIG. 78D illustrates the manner in which pin outs are specified, in accordance with one embodiment of the present invention;

[0097] FIG. 79 illustrates the various signals employed by the present invention;

[0098] FIG. 80 illustrates a read waveform representative of a cycle, in accordance with one embodiment of the present invention;

[0099] FIG. 81 illustrates a waveform representative of a write cycle, in accordance with one embodiment of the present invention;

[0100] FIG. 82 illustrates a table that lists the most common types that may be associated with a variable, in accordance with one embodiment of the present invention;

[0101] FIG. 83 illustrates a table that lists all prefixes to the above types for different architectural object types, in accordance with one embodiment of the present invention;

[0102] FIG. 84 illustrates a table that lists all statements in the Handel-C language, in accordance with one embodiment of the present invention;

[0103] FIGS. 85A and 85B illustrate a table that lists all operators in the Handel-C language, in accordance with one embodiment of the present invention;

[0104] FIGS. 86A through 86E illustrate a table that lists keywords, in accordance with one embodiment of the present invention;

[0105] FIG. 87A illustrates escape codes and their associated meanings, in accordance with one embodiment of the present invention;

[0106] FIG. 87B illustrates a method for distributing cores, in accordance with one embodiment of the present invention;

[0107] FIG. 87C illustrates a method for using a library map during the design of cores, in accordance with one embodiment of the present invention;

[0108] FIG. 87D illustrates a method for providing polymorphism using pointers, in accordance with one embodiment of the present invention;

[0109] FIG. 87E illustrates a method for generating libraries utilizing pre-compiler macros, in accordance with one embodiment of the present invention;

[0110] FIG. 87F illustrates a method for mimicking object oriented programming utilizing pointers in a programmable hardware architecture, in accordance with one embodiment of the present invention;

[0111] FIG. 88 illustrates an application program interface, in accordance with one embodiment of the present invention;

[0112] FIG. 89 illustrates that the physical layer is divided into a further two sections, in accordance with one embodiment of the present invention;

[0113] FIG. 90 is a schematic diagram of the application layer, physical layer, and user domain, in accordance with one embodiment of the present invention;

[0114] FIG. 91 shows a typical execution flow for a function, in accordance with one embodiment of the present invention;

[0115] FIG. 92 shows a typical address packet, in accordance with one embodiment of the present invention;

[0116] FIG. 93 illustrates a Trace and Pattern window, in accordance with one embodiment of the present invention; and

[0117] FIG. 94 illustrates several toolbar icons and their functions, in accordance with one embodiment of the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0118] A preferred embodiment of a system in accordance with the present invention is preferably practiced in the context of a personal computer such as an IBM compatible personal computer, Apple Macintosh computer or UNIX based workstation. A representative hardware environment is depicted in FIG. 1, which illustrates a typical hardware configuration of a workstation in accordance with a preferred embodiment having a central processing unit 110, such as a microprocessor, and a number of other units interconnected via a system bus 112.

[0119] The workstation shown in FIG. 1 includes a Random Access Memory (RAM) 114, Read Only Memory (ROM) 116, an I/O adapter 118 for connecting peripheral devices such as disk storage units 120 to the bus 112, a user interface adapter 122 for connecting a keyboard 124, a mouse 126, a speaker 128, a microphone 132, and/or other user interface devices such as a touch screen (not shown) to the bus 112, communication adapter 134 for connecting the workstation to a communication network (e.g., a data processing network) and a display adapter 136 for connecting the bus 112 to a display device 138.

[0120] The workstation typically has resident thereon an operating system such as the Microsoft Windows NT or Windows/95 Operating System (OS), the IBM OS/2 operating system, the MAC OS, or UNIX operating system. Those skilled in the art may appreciate that the present invention may also be implemented on platforms and operating systems other than those mentioned.

[0121] In one embodiment, the hardware environment of FIG. 1 may include, at least in part, a field programmable gate array (FPGA) device. For example, the central processing unit 110 may be replaced or supplemented with an FPGA. Use of such device provides flexibility in functionality, while maintaining high processing speeds.

[0122] Examples of such FPGA devices include the XC2000™ and XC3000™ families of FPGA devices introduced by Xilinx, Inc. of San Jose, Calif. The architectures of these devices are exemplified in U.S. Pat. Nos. 4,642,487; 4,706,216; 4,713,557; and 4,758,985; each of which is originally assigned to Xilinx, Inc. and which are herein incorporated by reference for all purposes. It should be noted, however, that FPGA's of any type may be employed in the context of the present invention.

[0123] An FPGA device can be characterized as an integrated circuit that has four major features as follows.

[0124] (1) A user-accessible, configuration-defining memory means, such as SRAM, PROM, EPROM, EEPROM, anti-fused, fused, or other, is provided in the FPGA device so as to be at least once-programmable by device users for defining user-provided configuration instructions. Static Random Access Memory or SRAM is, of course, a form of reprogrammable memory that can be differently programmed many times. Electrically Erasable and reProgrammable ROM or EEPROM is an example of nonvolatile reprogrammable memory. The configuration-defining memory of an FPGA device can be formed of a mixture of different kinds of memory elements if desired (e.g., SRAM and EEPROM), although this is not a popular approach.

[0125] (2) Input/Output Blocks (IOB's) are provided for interconnecting other internal circuit components of the FPGA device with external circuitry. The IOB's may have fixed configurations or they may be configurable in accordance with user-provided configuration instructions stored in the configuration-defining memory means.

[0126] (3) Configurable Logic Blocks (CLB's) are provided for carrying out user-programmed logic functions as defined by user-provided configuration instructions stored in the configuration-defining memory means.

[0127] Typically, each of the many CLB's of an FPGA has at least one lookup table (LUT) that is user-configurable to define any desired truth table, to the extent allowed by the address space of the LUT. Each CLB may have other resources such as LUT input signal pre-processing resources and LUT output signal post-processing resources. Although the term ‘CLB’ was adopted by early pioneers of FPGA technology, it is not uncommon to see other names being given to the repeated portion of the FPGA that carries out user-programmed logic functions. The term ‘LAB’ is used, for example, in U.S. Pat. No. 5,260,611 to refer to a repeated unit having a 4-input LUT.

[0128] (4) An interconnect network is provided for carrying signal traffic within the FPGA device between various CLB's and/or between various IOB's and/or between various IOB's and CLB's. At least part of the interconnect network is typically configurable so as to allow for programmably-defined routing of signals between various CLB's and/or IOB's in accordance with user-defined routing instructions stored in the configuration-defining memory means.
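The lookup table mentioned under feature (3) can be pictured, purely as an illustrative sketch, as a block of configuration bits addressed by the input signals. The function name make_lut and the convention that input i supplies address bit i are assumptions made for this example only; real devices differ in their details.

```python
def make_lut(truth_bits, n_inputs=4):
    """Return a function modelling an n-input LUT.

    truth_bits is an integer whose bit `a` holds the output for input address `a`;
    it plays the role of the user-provided configuration bits.
    """
    size = 1 << n_inputs                    # number of addressable entries
    if not 0 <= truth_bits < (1 << size):
        raise ValueError("truth table does not fit the LUT address space")

    def lut(*inputs):
        if len(inputs) != n_inputs:
            raise ValueError("wrong number of inputs")
        address = 0
        for i, bit in enumerate(inputs):    # input i supplies address bit i
            address |= (bit & 1) << i
        return (truth_bits >> address) & 1

    return lut


# A 4-input AND: only address 0b1111 (all inputs high) stores a 1.
and4 = make_lut(1 << 15)

# A 4-input parity (XOR) function: the entry is 1 when the address has an odd number of 1 bits.
parity_bits = sum((bin(a).count("1") & 1) << a for a in range(16))
xor4 = make_lut(parity_bits)
```

A 4-input LUT therefore needs 16 configuration bits, and any of the 65,536 possible truth tables over four inputs can be selected by loading a different 16-bit value.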

[0129] In some instances, FPGA devices may additionally include embedded volatile memory for serving as scratchpad memory for the CLB's or as FIFO or LIFO circuitry. The embedded volatile memory may be fairly sizable and can have 1 million or more storage bits in addition to the storage bits of the device's configuration memory.

[0130] Modern FPGA's tend to be fairly complex. They typically offer a large spectrum of user-configurable options with respect to how each of many CLB's should be configured, how each of many interconnect resources should be configured, and/or how each of many IOB's should be configured. This means that there can be thousands or millions of configurable bits that may need to be individually set or cleared during configuration of each FPGA device.

[0131] Rather than determining with pencil and paper how each of the configurable resources of an FPGA device should be programmed, it is common practice to employ a computer and appropriate FPGA-configuring software to automatically generate the configuration instruction signals that may be supplied to, and that may ultimately cause an unprogrammed FPGA to implement a specific design. (The configuration instruction signals may also define an initial state for the implemented design, that is, initial set and reset states for embedded flip flops and/or embedded scratchpad memory cells.)

[0132] The number of logic bits that are used for defining the configuration instructions of a given FPGA device tends to be fairly large (e.g., 1 Megabit or more) and usually grows with the size and complexity of the target FPGA. Time spent in loading configuration instructions and verifying that the instructions have been correctly loaded can become significant, particularly when such loading is carried out in the field.

[0133] For many reasons, it is often desirable to have in-system reprogramming capabilities so that reconfiguration of FPGA's can be carried out in the field.

[0134] FPGA devices that have configuration memories of the reprogrammable kind are, at least in theory, ‘in-system programmable’ (ISP). This means no more than that a possibility exists for changing the configuration instructions within the FPGA device while the FPGA device is ‘in-system’ because the configuration memory is inherently reprogrammable. The term ‘in-system’ as used herein indicates that the FPGA device remains connected to an application-specific printed circuit board or to another form of end-use system during reprogramming. The end-use system is, of course, one which contains the FPGA device and for which the FPGA device is to be at least once configured to operate within, in accordance with predefined, end-use or ‘in the field’ application specifications.

[0135] The possibility of reconfiguring such inherently reprogrammable FPGA's does not mean that configuration changes can always be made with any end-use system. Nor does it mean that, where in-system reprogramming is possible, reconfiguration of the FPGA can be made in a timely or convenient fashion from the perspective of the end-use system or its users. (Users of the end-use system can be located either locally or remotely relative to the end-use system.)

[0136] Although there may be many instances in which it is desirable to alter a pre-existing configuration of an ‘in the field’ FPGA (with the alteration commands coming either from a remote site or from the local site of the FPGA), there are certain practical considerations that may make such in-system reprogrammability of FPGA's more difficult than first apparent (that is, when conventional techniques for FPGA reconfiguration are followed).

[0137] A popular class of FPGA integrated circuits (IC's) relies on volatile memory technologies such as SRAM (static random access memory) for implementing on-chip configuration memory cells. The popularity of such volatile memory technologies is owed primarily to the inherent reprogrammability of the memory over a device lifetime that can include an essentially unlimited number of reprogramming cycles.

[0138] There is a price to be paid for these advantageous features, however. The price is the inherent volatility of the configuration data as stored in the FPGA device. Each time power to the FPGA device is shut off, the volatile configuration memory cells lose their configuration data. Other events may also cause corruption or loss of data from volatile memory cells within the FPGA device.

[0139] Some form of configuration restoration means is needed to restore the lost data when power is shut off and then re-applied to the FPGA or when another like event calls for configuration restoration (e.g., corruption of state data within scratchpad memory).
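The loss and restoration of configuration data described above can be modelled, as a rough sketch only, by a volatile device that reloads its configuration from a nonvolatile store each time power is applied. The class names NonVolatileStore and VolatileFpga and their methods are hypothetical and chosen for this illustration; they do not correspond to any actual device interface.

```python
class NonVolatileStore:
    """Hypothetical stand-in for a nonvolatile memory holding a redundant copy of the configuration."""

    def __init__(self, bitstream):
        self._bitstream = bytes(bitstream)  # retained regardless of FPGA power state

    def read(self):
        return self._bitstream


class VolatileFpga:
    """Hypothetical model of an SRAM-configured FPGA whose configuration is lost at power-off."""

    def __init__(self, restorer):
        self.restorer = restorer
        self.config = None                  # None models unconfigured (blank) SRAM cells

    def power_on(self):
        # Configuration restoration: reload the stored data into volatile memory.
        self.config = self.restorer.read()

    def power_off(self):
        # Volatile cells lose their contents when power is removed.
        self.config = None

    def is_configured(self):
        return self.config is not None


store = NonVolatileStore(b"\x5a\xa5\x0f")
fpga = VolatileFpga(store)
fpga.power_on()      # configuration is loaded from the store
fpga.power_off()     # configuration is lost
fpga.power_on()      # the same configuration is restored
```

The point of the sketch is only that the nonvolatile copy, rather than the FPGA itself, is the durable source of the configuration.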

[0140] The configuration restoration means can take many forms. If the FPGA device resides in a relatively large system that has a magnetic or optical or opto-magnetic form of nonvolatile memory (e.g., a hard magnetic disk), and the latency of powering up such an optical/magnetic device and/or of loading configuration instructions from such an optical/magnetic form of nonvolatile memory can be tolerated, then the optical/magnetic memory device can be used as a nonvolatile configuration restoration means that redundantly stores the configuration data and is used to reload the same into the system's FPGA device(s) during power-up operations (and/or other restoration cycles).

[0141] On the other hand, if the FPGA device(s) resides in a relatively small system that does not have such optical/magnetic devices, and/or if the latency of loading configuration memory data from such an optical/magnetic device is not tolerable, then a smaller and/or faster configuration restoration means may be called for.

[0142] Many end-use systems such as cable-TV set tops, satellite receiver boxes, and communications switching boxes are constrained by prespecified design limitations on physical size and/or power-up timing and/or security provisions and/or other provisions such that they cannot rely on magnetic or optical technologies (or on network/satellite downloads) for performing configuration restoration. Their designs instead call for a relatively small and fast acting, non-volatile memory device (such as a securely-packaged EPROM IC), for performing the configuration restoration function. The small/fast device is expected to satisfy application-specific criteria such as: (1) being securely retained within the end-use system; (2) being able to store FPGA configuration data during prolonged power outage periods; and (3) being able to quickly and automatically re-load the configuration instructions back into the volatile configuration memory (SRAM) of the FPGA device each time power is turned back on or another event calls for configuration restoration.

[0143] The term ‘CROP device’ may be used herein to refer in a general way to this form of compact, nonvolatile, and fast-acting device that performs Configuration-Restoring On Power-up services for an associated FPGA device. Unlike its supported, volatilely reprogrammable FPGA device, the corresponding CROP device is not volatile, and it is generally not ‘in-system programmable’. Instead, the CROP device is generally of a completely nonprogrammable type such as exemplified by mask-programmed ROM IC's or by once-only programmable, fuse-based PROM IC's. Examples of such CROP devices include a product family that the Xilinx company provides under the designation ‘Serial Configuration PROMs’ and under the trade name XC1700D™. These serial CROP devices employ one-time programmable PROM (Programmable Read Only Memory) cells for storing configuration instructions in nonvolatile fashion.

[0144] A preferred embodiment is written using Handel-C. Handel-C is a programming language marketed by Celoxica Limited. It enables a software or hardware engineer to target FPGAs (Field Programmable Gate Arrays) directly, in a similar fashion to classical microprocessor cross-compiler development tools, without recourse to a Hardware Description Language. This allows the designer to directly realize the raw real-time computing capability of the FPGA.

[0145] Handel-C allows one to use a high-level language to program FPGAs. It makes it easy to implement complex algorithms by using a software-based language rather than a hardware architecture-based language. One can use all the power of reconfigurable computing in FPGAs without needing to know the details of the FPGAs themselves. A program may be written in Handel-C to generate all required state machines, while one can specify storage requirements down to the bit level. A clock and clock speed may be assigned for working with the simple but explicit model of one clock cycle per assignment. A Handel-C macro library may be used for bit manipulation and arithmetic operations. The program may be compiled and then simulated and debugged on a PC similar to that in FIG. 1. This may be done while stepping through single or multiple clock cycles.

[0146] When one has designed their chip, the code can be compiled directly to a netlist, ready to be used by manufacturers' place and route tools for a variety of different chips.

[0147] As such, one can design hardware quickly because he or she can write high-level code instead of using a hardware description language. Handel-C optimizes code, and uses efficient algorithms to generate the logic hardware from the program. Because of the speed of development and the ease of maintaining well-commented high-level code, it allows one to use reconfigurable computing easily and efficiently.

[0148] Handel-C has the tight relationship between code and hardware generation required by hardware engineers, with the advantages of high-level language abstraction. Further features include:

[0149] C-like language allows one to program quickly

[0150] Architecture specifiers allow one to define RAMs, ROMs, buses and interfaces.

[0151] Parallelism allows one to optimize use of the FPGA

[0152] Close correspondence between the program and the hardware

[0153] Easy to understand timing model

[0154] Full simulation of the hardware on the PC

[0155] Display the contents of registers every clock cycle during debug

[0156] Rapid prototyping

[0157] Convert existing C programs to hardware

[0158] Works with manufacturers' existing tools

[0159] Rapid reconfiguration

[0160] Logic estimation tool highlights code inefficiencies in colored Web pages

[0161] Device-independent programs

[0162] Generates EDIF and XNF formats (and XBLOX macros)

[0163] Handel-C is thus designed to enable the compilation of programs into synchronous hardware; it is aimed at compiling high level algorithms directly into gate level hardware. The Handel-C syntax is based on that of conventional C so programmers familiar with conventional C may recognize almost all the constructs in the Handel-C language. Sequential programs can be written in Handel-C just as in conventional C but to gain the most benefit in performance from the target hardware its inherent parallelism may be exploited. Handel-C includes parallel constructs that provide the means for the programmer to exploit this benefit in his applications. The compiler compiles and optimizes Handel-C source code into a file suitable for simulation or a net list which can be placed and routed on a real FPGA.
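The benefit of the parallel constructs can be illustrated with a small software model of the timing rule. The following C sketch is not Handel-C and is not part of the described toolchain. It assumes the rule stated earlier (one clock cycle per assignment) and additionally assumes that a parallel block finishes when its longest branch finishes. The function names `seq_cycles` and `par_cycles` are hypothetical.

```c
#include <stddef.h>

/* Hypothetical cycle-count model: each assignment costs one clock cycle. */

/* Sequential block: the cycle counts of the statements add up. */
unsigned seq_cycles(const unsigned *stmt_cycles, size_t n)
{
    unsigned total = 0;
    for (size_t i = 0; i < n; i++)
        total += stmt_cycles[i];
    return total;
}

/* Parallel block (assumption): all branches start together, so the
   block takes as long as its slowest branch. */
unsigned par_cycles(const unsigned *branch_cycles, size_t n)
{
    unsigned longest = 0;
    for (size_t i = 0; i < n; i++)
        if (branch_cycles[i] > longest)
            longest = branch_cycles[i];
    return longest;
}
```

Under this model, three single-cycle assignments cost three cycles when written sequentially and one cycle when placed in parallel branches.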

[0164] More information regarding the Handel-C programming language will now be set forth. For further information, reference may be made to “EMBEDDED SOLUTIONS Handel-C Language Reference Manual: Version 3,” “EMBEDDED SOLUTIONS Handel-C User Manual: Version 3.0,” “EMBEDDED SOLUTIONS Handel-C Interfacing to other language code blocks: Version 3.0,” and “EMBEDDED SOLUTIONS Handel-C Preprocessor Reference Manual: Version 2.1,” each authored by Rachel Ganz, and published by Embedded Solutions Limited, and which are each incorporated herein by reference in their entirety.

[0165] The present description is divided in a plurality of sections set forth under the headings:

[0166] HANDEL-C COMPILER AND SIMULATOR

[0167] HANDEL-C LANGUAGE

[0168] PREPROCESSOR

[0169] FPGA-BASED CO-PROCESSOR API

[0170] FIXED AND FLOATING POINT LIBRARY

[0171] WAVEFORM ANALYSIS

HANDEL-C Compiler and Simulator

[0172] Conventions

[0173] A number of conventions are used throughout this description. These conventions are detailed below. Hexadecimal numbers appear throughout this description. The convention used is that of prefixing the number with ‘0x’ in common with standard C syntax.

[0174] Sections of code or commands that one may type are given in typewriter font as follows:

[0175] “void main( );”

[0176] Information about a type of object one may specify is given in italics as follows:

[0177] “copy SourceFileName DestinationFileName”

[0178] Menu items appear in narrow bold text as follows:

[0179] “insert Project into Workspace”

[0180] Elements within a menu are separated from the menu name by a >, so Edit>Find means the Find item in the Edit menu.

[0181] Introduction

[0182] Handel-C is a programming language designed to enable the compilation of programs into synchronous hardware. The Handel-C compiler and simulator will now be described. The Handel-C language may be described hereinafter in greater detail.

[0183] The present description contains:

[0184] Getting started

[0185] User interface overview

[0186] Compiler and simulator overview

[0187] Examples of compiler and simulator use

[0188] Notes on using Handel-C and porting C code to Handel-C

[0189] Description of interfacing with VHDL code

[0190] Guide to the API (Application Programmers Interface)

[0191] Descriptions of the bitmap to data conversion utilities used by the examples.

[0193] Overview

[0194] Design Flow Overview

[0195] FIG. 2 illustrates a design flow overview 200, in accordance with one embodiment of the present invention. The dotted lines 202 show the extra steps 204 required if one wishes to integrate Handel-C with VHDL.

[0196] Getting Started.

[0197] Introduction

[0198] The present section gives a brief description of how to use the Handel-C compiler and simulator.

[0199] The Handel-C Development Environment

[0200] FIG. 3 illustrates the Handel-C development environment 300, in accordance with one embodiment of the present invention. The Handel-C development environment is a standard Windows development environment. It is in four main parts. The windows and toolbars are standard Windows dockable windows and customizable toolbars.

[0201] Expected Development Sequence

[0202] The normal development sequence for a single-chip project is as follows:

[0203] 1. Create a new project.

[0204] 2. Configure the project.

[0205] 3. Add the empty source code files to the project.

[0206] 4. Create source code.

[0207] 5. Link to any required libraries.

[0208] 6. Set up the files for debug.

[0209] 7. Compile the project for debug.

[0210] 8. Debug the project.

[0211] 9. Compile the project for target chip.

[0212] 10. Export the target file to a place and route tool.

[0213] 11. Place and route.

[0214] The Handel-C documentation does not necessarily cover placing and routing. The remaining steps are described below.

[0215] Invoking the Environment.

[0216] One starts Handel-C by doing one of:

[0217] selecting Start>Programs>Handel-C>Handel-C

[0218] double-clicking on an existing Handel-C workspace file (files with the extension .hw)

[0219] double-clicking the Handel-C icon

[0220] FIG. 4 illustrates a graphical user interface 400 shown if one starts the program with an empty workspace.

[0221] Creating the Project

[0222] FIG. 5 illustrates a graphical user interface 500 used to create a project, in accordance with one embodiment of the present invention.

[0223] Select New from the File menu.

[0224] Select the Project tab in the dialog that appears.

[0225] One may be asked for the name and location (pathname for the directory that it is stored in) for the project. One can look for a directory by clicking the . . . button to the right of the Location box.

[0226] By default, a new workspace is created for the project in the same directory as the project. Workspace files have .hw extensions. Project files have .hp extensions. When one starts a new project, one may have to define its type. FIG. 6 illustrates the various types 600 of new projects, in accordance with one embodiment of the present invention.

[0227] Common pre-defined project types are supplied with Handel-C.

[0228] Select the appropriate project type from the types listed in the Project pane.

[0229] Click OK.

[0230] Configuring the Project

[0231] Once a project has been created, one should configure its settings. These settings define what type of chip is targeted, and how the compiler, pre-processor and optimizer work. The default settings are correct for a new project that one wishes to debug.

[0232] Adding Files to the Project

[0233] Add a Handel-C source file to the new project. This may be one that a person has already written, or a new, empty one.

[0234] Creating a New File

[0235] Select File>New, and click the Source File tab.

[0236] Select whether it's a header file or a source file in the left-hand pane.

[0237] Select the project the file should belong to from the drop-down list of current projects.

[0238] Set the location (the directory path where the file is stored), either by typing the pathname in the box, or selecting a directory by clicking the . . . button.

[0239] The code editor window may open.

[0240] Adding an Existing File

[0241] Select Project>Add to Project>Files and browse the directory tree for the files one wishes to add.

[0242] One can add multiple files from a directory by selecting them all. OR

[0243] Right-click the mouse on the project, and select Add Files to Folder from the shortcut menu.

[0244] Removing Files from a Project

[0245] One can remove files from a project by selecting the file in the workspace window and pressing the Delete key or selecting Edit>Delete. This does not delete the file from the hard disk.

[0246] Opening an existing source code file does not add it to the project, so it will not be built or compiled. Files must be added to the project explicitly.

[0247] Writing Source Code

[0248] One may write Handel-C source code in the source code editor. Code is indented at the same level as the line above it and is syntax highlighted.

[0249] Having a file open in the source code editor does not mean that it is part of the project. The only files that may be compiled and built are those that have been added to the project.

[0250] Setting up for Debug

[0251] There are several methods of coding Handel-C to help one debug a project.

[0252] They fall into two kinds:

[0253] Code which may automatically be discarded by the compiler if one does not compile a project for debug, e.g., the with {infile=“file”} directive

[0254] Code where one supplies alternatives to be compiled for debug and release or target compilations. In these cases, one can use the #ifdef DEBUG, #ifdef NDEBUG and #ifdef SIMULATE directives.

[0255] By default, DEBUG and SIMULATE may be defined if one compiles for debug, and NDEBUG may be defined for all other compilations. For example:

[0256] #ifdef SIMULATE

[0257] sim_chan ? var; // Read from simulator

[0258] #else

[0259] HardwareMacroRead(var); // Real HW interface

[0260] #endif
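The same pattern can be tried out in plain C. The following is a minimal sketch, assuming only that SIMULATE is defined by the build when compiling for debug, as stated above. The functions `read_from_simulator` and `read_from_hardware` are hypothetical stand-ins and are not Handel-C or board-support calls.

```c
/* Hypothetical stand-ins for the two input paths. */
int read_from_simulator(void) { return 1; }
int read_from_hardware(void)  { return 2; }

/* Selects the input path at compile time. SIMULATE is assumed to be
   defined only for debug builds. */
int read_input(void)
{
#ifdef SIMULATE
    return read_from_simulator();   /* debug build: simulator channel */
#else
    return read_from_hardware();    /* release or target build */
#endif
}
```

Compiling the sketch with SIMULATE defined (for example through a compiler command-line definition) makes `read_input` use the simulator path. Without the definition, the hardware path is compiled in and the simulator branch is discarded by the preprocessor.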

[0261] Summary of coding techniques used for debug:

[0262] Substitute simulator channels for hardware interface channels

[0263] Use the assert directive to stop a compilation if a condition is untrue.

[0264] Substitute file input for external channel input

[0265] Export the contents of variables into files

[0266] Build and Compile for Debug

[0267] Debug is the default compilation target. It is unlikely that one would need to make any changes to the project settings at this stage. The compiler creates a file which is in turn compiled into native PC code using Microsoft Visual C++. This creates the chip simulation.

[0268] To build and compile the project, select Build from the Build menu. Messages from the compiler may appear in the Build tab of the output window.

[0269] Debug and Simulation

[0270] Select Start debug from the Build menu. The Debug menu may replace the Build menu. A person can step through the code from execution point to execution point. Statements that are completed at the end of the current clock cycle are marked with an arrow.

[0271] The arrows are color coded as follows:

[0272] Yellow: current point

[0273] White: other points in this thread executed in this cycle

[0274] Grey: points in other threads executed in this cycle

[0275] To set a breakpoint, click in the code editor on the line where one wishes to set the breakpoint and then click the breakpoint button. A red circle may appear at the beginning of that line. When the debugger reaches that line, it may stop. FIG. 7 illustrates a breakpoint 700, in accordance with one embodiment of the present invention.

[0276] Optimize Code as Necessary

[0277] One can examine the depth and speed of the code by compiling with the -e option selected in the Compiler tab of the Project Settings dialog. This creates:

[0278] an html file for the project, project.html

[0279] an html file for each file in the project files_c.html.

[0280] These files highlight the code according to the code area and timing. The project.html file has links to all the html files highlighting the source code. It also links to the 5 top areas and 5 top delays in the project. One can use this as a basis for optimizing the code. An example of progressive optimization is given later.

[0281] Compile for Release

[0282] When one is satisfied with the project, select Build>Set Active Configuration and choose the type of build required from the available configurations. Release allows one to simulate the project without the delays inherent in debug. It also allows one to compile Handel-C libraries without debug information to protect intellectual property. Target is one of VHDL and EDIF. These are files that are ready to be placed and routed. By default, most optimizations may be turned on.

[0283] Project Settings

[0284] FIG. 8 illustrates a project settings interface 800, in accordance with one embodiment of the present invention. Project settings define how projects are compiled and built. Select Project>Settings to see the Project settings dialog box. The different settings 802 are available via tabs 804. If the desired tab is not visible, scroll the tabs by clicking on the arrows 806 at the end of the tabs. Note that some tabs are not available for an empty project. FIGS. 9A, 9B, and 9C illustrate available settings 900.

[0285] Independent Settings for Files

[0286] One can create independent settings for a file. One might wish to do this in order to change the optimization level for a particular file. Project settings for a file override the general project settings.

[0287] To create settings for a file, open the Project Settings dialog (either right-click the file in the File View and select Settings, or select Project>Settings).

[0288] Select the name of the file that one wishes to affect in the file pane of the Project Settings dialog.

[0289] Make the appropriate changes.

[0290] Configurations

[0291] There are three types of configuration that one can select from to build the application:

[0292] Debug (default)

[0293] Release

[0294] Target (VHDL, EDIF etc.)

[0295] Debug is used to build a configuration that can be simulated and debugged on the PC. In debug mode, one can view the contents of registers and step through the program's source code.

[0296] Release mode is used to create Handel-C intellectual property (libraries). It creates compiled code that has no debug messages and can be used in another program. Release mode can also be used for high-speed simulation.

[0297] In target mode, one gets a list of gates, ready to be placed and routed on an FPGA.

[0298] Defining Configurations

[0299] FIG. 10 illustrates a configurations graphical user interface 1000, in accordance with one embodiment of the present invention. One can save a particular combination of settings as a project configuration using the Build>Configurations menu item. This user-defined configuration can only be used in the project. Handel-C comes with four default configurations: Build 1002, Debug 1004, VHDL 1006 and EDIF 1008. One can copy one of these configurations and then make changes to it.

[0300] Select Build>Configurations . . .

[0301] Click the Add button in the dialog that appears.

[0302] Enter a name for the new configuration, and select the configuration type that one wishes to use as a base in the Copy settings from box.

[0303] More Complex Configurations

[0304] If one knows that he or she is going to have multiple projects (perhaps one needs to have two independent circuits on the same chip), it is better to create a workspace first and then add the projects to it.

[0305] If one has an existing workspace set up, it may be opened. Otherwise, select New from the File menu. Create a new workspace to store the project(s). One may be asked for its name and location (pathname for the directory that it may be stored in). Either type the pathname in the Location box, or use the . . . button to browse for a directory. Workspace files have .hw extensions.

[0306] Adding an Existing Project to a Workspace

[0307] Select Insert Project into Workspace from the Project menu.

[0308] Creating a Complex Project

[0309] If a project is a board or system, it may contain subprojects. When one creates a new complex project type (by writing a new .cf file) a dialog box appears when one clicks OK. The New Project Components dialog box asks what projects one wishes to use for the components of the project. One can either create a new project or select one within the workspace from the drop-down list. If the project exists but is not in the workspace, one can add it using the Insert Project button.

[0310] To ensure that the subprojects are built when one builds the complex project, he or she can set up the subprojects as dependent. Select Project>Dependencies . . .

[0311] One may be offered a list of the projects in the workspace. Check the ones that are desired to be rebuilt when building the complex project.

[0312] Dependencies

[0313] Dependencies are used to ensure that files that are not part of the project are updated during a build. They also specify the order that files may be compiled and built.

[0314] There are three types of dependencies used in Handel-C:

[0315] Project dependencies

[0316] File dependencies

[0317] External dependencies

[0318] The only one that can be changed directly is Project Dependencies. The others show information calculated by the compiler.

[0319] Project Dependencies

[0320] The Project>Dependencies . . . dialog allows one to select other projects within the workspace that this project is dependent on. Projects listed here may be rebuilt as necessary when the project is rebuilt.

[0321] If one is building a complex project, such as a board or system that has several chips on it, he or she can create a separate project for each chip, and make the system project dependent upon them.
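The effect of project dependencies on build order can be sketched in C. This is only an illustration under stated assumptions: the `Workspace` structure, the dependency matrix and the function `build_order` are hypothetical, the actual workspace file format is not described here, and the dependencies are assumed to contain no cycles. The sketch emits every project a target depends on before the target itself.

```c
#include <stddef.h>

#define MAX_PROJECTS 8

/* deps[i][j] != 0 means project i depends on project j (hypothetical
   representation chosen for the sketch). */
typedef struct {
    size_t count;
    int deps[MAX_PROJECTS][MAX_PROJECTS];
} Workspace;

/* Depth-first walk: emit a project only after everything it needs. */
static void visit(const Workspace *ws, size_t p, int *done,
                  size_t *order, size_t *n_out)
{
    if (done[p])
        return;
    done[p] = 1;                      /* assumes no circular dependencies */
    for (size_t j = 0; j < ws->count; j++)
        if (ws->deps[p][j])
            visit(ws, j, done, order, n_out);
    order[(*n_out)++] = p;
}

/* Fills `order` with the projects to build for `target`, dependencies
   first, and returns how many entries were written. */
size_t build_order(const Workspace *ws, size_t target, size_t *order)
{
    int done[MAX_PROJECTS] = {0};
    size_t n_out = 0;
    visit(ws, target, done, order, &n_out);
    return n_out;
}
```

For a system project that depends on two chip projects, the sketch lists both chip projects before the system project, which mirrors the rebuild behavior described above.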

[0322] File Dependencies

[0323] File dependencies are listed in the file properties. They specify the user include files that are not included in the project which are needed to compile and build a selected file. They also specify what other files within the project must be compiled before this file.

[0324] These dependencies are generated when one compiles a file. One can examine them by selecting a file in the File View pane of the workspace window and typing Alt+Enter or right-clicking the file name and selecting Properties from the shortcut menu.

[0325] External Dependencies

[0326] The External Dependencies folder appears in the workspace window after a project has been built. It contains a list of the header files required by the project that are not included in the project.

[0327] User Interface

[0328] The Workspace Window

[0329] The workspace window contains workspaces and projects. A workspace is simply an area that one keeps projects in. It allows one to organize the files that one needs for each project. One could generally use one workspace per system (a system is the configuration that one is targeting).

[0330] A project consists of everything one needs to create one or more net list files ready to be placed and routed on an FPGA, together with the project settings. Project settings provide information about where the files for the project are stored, the target chip for the project, how the compilation may work, and optimization requirements. Projects can be libraries (compiled Handel-C that is not targeted for a particular output), cores (a piece of code, such as a function), complete net lists for a chip, boards (net lists for several chips in a specified configuration) or systems (a combination of boards etc.). In one embodiment, the core may optionally be compiled to a net list.

[0331] The workspace window has two views:

[0332] File view

[0333] Symbol view

[0334] File View

[0335] FIG. 11 illustrates a file view interface 1100, in accordance with one embodiment of the present invention. File view shows the workspace, its projects, and their source files and folders 1102. If there are multiple projects in a single workspace, the current project name 1104 may be in bold. The file view gives the structure of files in the project. It has no relationship to the way one has stored files on a hard disk. It allows one to set up dependencies (what files are needed for this project and what files or projects they depend upon) and manage the project by seeing which files are used within it.

[0336] One can adjust the space given to the Object and Info columns 1106 by dragging the edge of the column heading. Double-clicking on a source file opens it in the code editor. Double-clicking on anything else expands or contracts that branch of the workspace tree. Right-clicking on a filename or directory gives one a list of commonly-used commands.

[0337] File Properties

[0338] FIG. 12 illustrates file properties 1200, in accordance with one embodiment of the present invention. To operate, one may select a file or directory in the workspace window 1202, then select View>Properties. This displays:

[0339] Inputs: The tools used and the source file pathname(s) that each tool requires

[0340] Outputs: The output files generated by the specified tool

[0341] Dependencies: The header files (dependencies) this file requires.

[0342] Managing the Project Files

[0343] One can order the files within the project into folders. These folders are only used to organize the files. They do not exist as folders on the hard disk and have no effect on the directory structure.

[0344] Select Project>Add to Project>New Folder

[0345] Type the name of the folder in the dialog box that appears

[0346] Type the extension for the file types it should contain. One can leave the box blank.

[0347] Click OK

[0348] A new folder appears in the file view window.

[0349] Drag the files that are desired to be moved across to the folder.

[0350] Symbol View

[0351] FIG. 13 illustrates a workspace interface 1300 and the associated icons, in accordance with one embodiment of the present invention. A symbol is anything defined by the user (functions, variables, macros, typedefs, enums etc.). Symbol view allows one to see what one has in a project. It is empty before one builds a project. When one builds the project with the browse information enabled (set by default in the Debug configuration), a symbol table is created that allows one to examine the symbols defined and used in the project. Selecting the Symbol View tab 1302 of the workspace window then shows icons 1304 representing logic and architectural variables, functions and procedures.

[0352] Each icon is identified by its definition and use (references). External symbols (external variables and function names) appear in alphabetical order.

[0353] Double-clicking on a symbol expands it if it is expandable; if not, it opens the relevant source code file, with the appropriate line tagged. Local symbols appear in alphabetical order within the function or procedure where they are defined.

[0354] FIG. 14 illustrates a version test interface 1400, in accordance with one embodiment of the present invention.

[0355] The Source Browser

[0356] FIG. 15 illustrates a browse and associated results interface 1500, in accordance with one embodiment of the present invention. One can browse for definitions and references 1502 without using symbol view. When one selects the Source Browser command from the Tool menu, one is given a Browse dialog box.

[0357] Enter the symbol being searched for, and a dialog box may be shown giving its definition and references to it.

[0358] Browse Commands

[0359] If one selects a symbol name in a source file, one can use the browse commands and buttons to find its definitions and references in all the files used in a project. FIGS. 16A and 16B illustrate browsing commands 1600, in accordance with one embodiment of the present invention.

[0360] Editing

[0361] The Code Editor

[0362] The code editor is a simple editor that resides in its own window. The syntax is color coded. One can change the color codes by selecting the Format tab from the Tools>Options dialog box. The default values are:

[0363] Comments: green

[0364] Handel-C keywords: blue

[0365] Number: black

[0366] String: black

[0367] Operator: black

[0368] One can use standard editing commands within the code window. These are accessible from the Edit menu. FIG. 17 is a table of editing commands 1700, in accordance with one embodiment of the present invention. The Edit menu also has the Bookmarks and Browse sub-menus and the Breakpoints command.

[0369] Find Commands

[0370] Handel-C has simple Find and Replace commands that allow one to search for text in the current file, and the Find in Files command, which allows one to search for a string in all the files in a directory. The output from this command can be sent to two different window panes, allowing one to view the results of two searches. To choose which pane is selected, check or uncheck the Output to pane 2 box in the Find in Files dialog.

[0371] These searches work line by line, which means that one cannot match text that spans more than one line. One can also search using regular expressions. To do this, check Regular expression in the Find and Find in Files dialog box. The regular expressions supported are listed below. FIG. 18 is a table of regular expressions 1800, in accordance with one embodiment of the present invention.
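The consequence of searching line by line can be shown with a short C sketch. It uses the POSIX regular expression functions (`regcomp`, `regexec`, `regfree`) with extended syntax, which is an assumption made for illustration; the expressions actually supported by the editor are those of FIG. 18 and may differ. The function name `count_matching_lines` is hypothetical.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Counts the lines of `text` that match `pattern`. Each line is tested
   on its own, so a match can never span a line break. Returns -1 if
   the pattern does not compile. */
int count_matching_lines(const char *text, const char *pattern)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    int matches = 0;
    char line[256];
    const char *p = text;
    while (*p != '\0') {
        size_t len = strcspn(p, "\n");          /* length of this line */
        size_t copy = len < sizeof line - 1 ? len : sizeof line - 1;
        memcpy(line, p, copy);                  /* long lines are truncated */
        line[copy] = '\0';
        if (regexec(&re, line, 0, NULL, 0) == 0)
            matches++;
        p += len;
        if (*p == '\n')
            p++;                                /* skip the line break */
    }
    regfree(&re);
    return matches;
}
```

A pattern such as `a;.int` would match across the first line break of the text "int a;", "int b;" if the whole text were searched at once, but finds nothing here because each line is examined separately.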

[0372] Bookmarks Submenu

[0373] The Bookmarks submenu allows one to set and clear bookmarks within the files. Once one has set bookmarks in the file, one can move through the bookmarks by selecting Next Bookmark (F2) or Previous Bookmark (Shift F2).

[0374] To Set a Bookmark

[0375] Select the line where one wishes to place the bookmark

[0376] Press the toggle bookmark button OR

[0377] Right-click the line and select Toggle bookmark from the shortcut menu that appears OR

[0378] Select Edit>Bookmarks>Toggle Bookmark (Ctrl F2).

[0379] To Move to a Bookmark

[0380] Select Edit>Bookmarks>Next Bookmark (F2) or press the next bookmark button to move forward through the bookmarks

[0381] Select Edit>Bookmarks>Previous Bookmark (Shift F2) or press the previous bookmark button to move backwards.

[0382] To Remove a Bookmark

[0383] Select the line where one wishes to clear the bookmark. Press the toggle bookmark button OR

[0384] Right-click the line and select Toggle bookmark from the shortcut menu that appears OR

[0385] Select Edit>Bookmarks>Toggle Bookmark (Control F2).

[0386] To Remove all Bookmarks

[0387] Select Edit>Bookmarks>Clear All Bookmarks (Control Shift F2) or press the clear all bookmarks button to clear all bookmarks

[0388] Breakpoints Command

[0389] The Breakpoints command allows one to set, enable and disable breakpoints. Breakpoints are fully discussed hereinafter in greater detail.

[0390] Breakpoints (Alt+F9): Display a dialogue box for editing the breakpoints list for this project.

[0391] Browse Submenu

[0392] The Browse submenu allows one to find definitions of and references to selected variables or other symbols. If one makes a change to a variable, this is a quick way of finding everywhere that the variable is used.

[0393] To Find the Definition of a Variable or Other Symbol

[0394] Select the symbol name in an edit window.

[0395] Select Edit>Browse>Go to Definition or click the button.

[0396] To Find the First Reference to a Variable or Other Symbol

[0397] Select the symbol name in an edit window.

[0398] Select Edit>Browse>Go to Reference or click the button.

[0399] To Move Through the References to and Definitions of a Variable or Other Symbol

[0400] Select the symbol name in an edit window.

[0401] To move forward, select Edit>Browse>Next Definition Reference or click the button

[0402] To move backward, select Edit>Browse>Previous Definition Reference or click the button

[0403] To Return to the Position Before Starting Browsing

[0404] Select Edit>Browse>Pop Context or click the button

[0405] Saving Changes

[0406] If one has not saved changes to a file, an asterisk appears after the filename on the title bar. One may be asked if he or she wishes to save changes when a file is closed.

[0407] Files and Paths

[0408] The current directory is the directory containing the current project's .hp file. All relative pathnames are calculated from that current directory.

[0409] Project Files Generated

[0410] When one creates a workspace, a directory is created for that workspace. Projects within the workspace may be in the same directory or a sub-directory. When one builds a project, a directory is created for the build results. The default directory name is the name of the build type (Debug, Release, VHDL or EDIF). One can change this by setting the Output Directory values in the General tab of the Project Settings dialog.

[0411] These are the files built for a workspace prog.hw, containing a project example 1, consisting of one Handel-C file, prog.c, that has been compiled for simulation. The files may all be stored in the Debug folder. FIG. 19 is a table of various project files 1900, in accordance with one embodiment of the present invention.

[0412] Search Paths

[0413] Code files that one has added to the project workspace may be compiled and built. Header files may only be found by the pre-processor if they exist on a known path.

[0414] The directories searched are in the following order:

[0415] 1. Directory containing the Handel-C file that has the #include directive (if within quotes).

[0416] 2. Directories listed in Project>Settings>Preprocessor>Additional include directories (in the order specified)

[0417] 3. Directories listed in the Directories pane of the Tools>Options dialog (in the order specified)

[0418] 4. Directories in the HANDELC_CPPFLAGS environment variable (in the order specified)
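The search order above amounts to a first-match lookup over an ordered list of directories. The following C sketch illustrates that lookup under several assumptions: the caller has already arranged `dirs` in the four-step order listed above, a forward slash is used as the path separator, and file existence is supplied through a callback so the sketch can be exercised without a file system. The names `find_header`, `exists_fn` and `demo_exists` are hypothetical and are not part of the Handel-C tools.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Callback that reports whether a path exists; a real tool would query
   the file system instead. */
typedef int (*exists_fn)(const char *path);

/* Tries each directory in order and writes the first existing
   "dir/header" into `out`. Returns the index of the directory used,
   or -1 (with `out` emptied) if the header is not found. */
int find_header(const char *header, const char *const *dirs, size_t n_dirs,
                exists_fn exists, char *out, size_t out_size)
{
    for (size_t i = 0; i < n_dirs; i++) {
        int len = snprintf(out, out_size, "%s/%s", dirs[i], header);
        if (len < 0 || (size_t)len >= out_size)
            continue;                 /* path too long for the buffer */
        if (exists(out))
            return (int)i;
    }
    if (out_size > 0)
        out[0] = '\0';
    return -1;
}

/* Example callback: pretends that only these two files exist. */
int demo_exists(const char *path)
{
    return strcmp(path, "proj/include/board.h") == 0 ||
           strcmp(path, "tools/include/board.h") == 0;
}
```

With the directory list "src", "proj/include", "tools/include", "env/include", a lookup of board.h stops at the second entry even though a later directory also contains a file of that name, which is why the order of the list matters.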

[0419] Windows and Toolbars

[0420] The Handel-C user interface has standard scrollable windows and customizable toolbars. One can customize:

[0421] The way the edit and build environment is laid out (position of workspace and output windows etc.)

[0422] The way document windows are laid out (this is specific to each workspace)

[0423] The debugger layout (the way windows look when one is in the debugger)

[0424] These layouts are stored. The edit and build and the debug layouts are kept for the copy of Handel-C. If one changes them, he or she changes them for every project. The document window layout is kept with the workspace, and can change whenever he or she changes the current workspace.

[0425] Window Types

[0426] Document windows are movable within the Handel-C window. One can resize them and drag them about. Docking windows can either be docked at one of the window margins, or can float above the other windows. When a window is docked it has no title-bar. If one has docked a code editor window, the file name appears in brackets after the project title in the Handel-C title bar. To float a docked window, double-click its border. To dock a floating window, either double-click its border, or drag its title bar to a docking position.

[0427] Splitting Windows

[0428] One can split text windows by dragging the small box immediately above the vertical scroll bar.

[0429] The Windows Menu

[0430] The windows menu allows one to control the size and display of editing windows. It has the following commands:

[0431] New window: Create a copy of the current window

[0432] Split: Split the window into two or four views.

[0433] Docking view: Enable/disable docking view of selected dockable window

[0434] Close: Close current window

[0435] Close All: Close all windows

[0436] Next: Move to next pane of a split window

[0437] Previous: Move to previous pane of a split window

[0438] Cascade: Cascade all open windows with title bars visible

[0439] Tile Horizontally: Display all windows, splitting the viewing area horizontally

[0440] Tile Vertically: Display all windows, splitting the viewing area vertically

[0441] Arrange Icons: Arrange minimized window icons along bottom of viewing area

[0442] Windows . . . : Open Windows dialog

[0443] Windows Dialog

[0444] The Windows dialog gives the names of all open edit windows. A person can make one of them the current window, or select a group of windows to be saved, closed or tiled.

[0445] Full Screen Display

[0446] The Full Screen command on the Edit menu displays the code editor pane at maximum size. The normal menu bars and toolbars are not visible. To return to a normal view, click the no full screen button.

[0447] Toolbars

[0448] When one starts Handel-C, toolbars appear under the menu bar. They are:

[0449] The standard toolbar

[0450] Build mini-bar

[0451] Browse mini-bar

[0452] Debug mini-bar

[0453] Bookmark mini-bar.

[0454] Standard Toolbar Buttons

[0455] The standard toolbar buttons are a frequently used subset from the File, Edit and View menus.

[0456] Changing Toolbars

[0457] The toolbars in Handel-C are dockable. They can be docked at one of the edges of the Handel-C window, or they can float. One can change a toolbar from docked to floating and back by double-clicking on it. One can move them by dragging the title bar or the double bar.

[0458] The Status Bar

[0459] The status bar is visible at the bottom of the Handel-C window. It displays information about items when the mouse is over them.

[0460] The Tools Menu

[0461] The tools menu has the Source Browser command and commands to customize the copy of Handel-C.

[0462] The Source Browser Command

[0463] The Source Browser command allows one to search for names of variables and functions within the code. It directs one to their definition and lists references to them. Its use is discussed in greater detail hereinafter.

[0464] Customizing the Interface

[0465] FIG. 20 illustrates a GUI 2000 for customizing the interface, in accordance with one embodiment of the present invention. The Customize . . . command brings up the Customize dialog. The Toolbar tab 2002 allows one to change the display of toolbars utilizing various options 2004, as shown. To use, one may check a toolbar in the toolbar pane to display it, or uncheck it to hide it.

[0466] Show Tooltips Check this to pop up the purpose of a button when the mouse cursor is over it.

[0467] Cool Look Check this to make the buttons appear two-dimensional

[0468] Large Buttons Check this to increase the button size

[0469] Large Icons Check this to have large icons on large buttons.

[0470] The Command tab allows one to add menus and buttons to the toolbar and menu bar. The right-hand pane displays the buttons and menu commands available.

[0471] Select the button or menu that one wishes to add and drag it to the toolbar or menu bar. If one drags a menu command to a toolbar, it appears as a button. If one drags it to an empty area, it appears as a new floating window.

[0472] Removing Buttons and Menus

[0473] One can remove buttons from a toolbar by opening the Tools>Customize dialog and then dragging them off the toolbar. One can remove menus from the menu bar by opening the Tools>Customize dialog and dragging the menu name off the menu bar.

[0474] To restore a toolbar to its previous state, select the Toolbars tab of the Tools>Customize dialog. Select the toolbar (under the Toolbars tab) or the menu (under the Commands tab)

[0475] Options

[0476] The Tools>Options command allows one to set options:

[0477] Editor Set the window options for the editor. Define when files are saved.

[0478] Tabs Define how tabs are handled and whether Auto-Indent is used.

[0479] Debug Set the default base used to display numbers in the debug windows. This information is overruled by the Handel-C show specification.

[0480] Format Define the color and font of text and markers in windows.

[0481] Workspace Set the number of recently opened workspaces in the workspace list.

[0482] Directories Set the directories that may be searched for include and library files used in projects.

[0483] Editor

[0484] Vertical scroll bar Check to display vertical scroll bar

[0485] Horizontal scroll bar Check to display horizontal scroll bar

[0486] Automatic window recycling Display files opened by the IDE (integrated development environment) in an existing window

[0487] Selection margin Use a selection margin in the editor window to enable one to select paragraphs, etc.

[0488] Drag and drop text editing Edit by selecting an area, and dragging it to a new position

[0489] Save before running tools Save files before running tools defined in the Tools menu

[0490] Prompt before saving files Ask before saving

[0491] Automatic reload of externally modified files If a file is open in Handel-C, and then modified by something outside Handel-C, load changes from disk automatically.

[0492] Tabs

[0493] File type Define settings for specified file types or define default settings.

[0494] Tab size Equivalent number of spaces per tab

[0495] Insert spaces/Keep tabs Select whether to use spaces or tabs in file

[0496] Auto indent Check to auto-indent text to above line's indent

[0497] Debug

[0498] Base for numbers Select default display base in debug windows

[0499] Format

[0500] Category Select window type(s) to modify

[0501] Font Select font to display text in

[0502] Size Select display font size

[0503] Colors Select text type to modify

[0504] Foreground: Set foreground color

[0505] Background: Set background color.

[0506] Sample Display sample text in selected settings

[0507] Reset All Return to default settings

[0508] Workspace

[0509] Default workspace list Set number of recent workspaces in the File>Recent Workspaces command

[0510] Directories

[0511] Show directories for: Select include path list or library path list

[0512] Add or remove directory paths to search for include files or library files.

[0513] Compiler

[0514] FIG. 20A illustrates a method 2050 for a compiler capable of compiling a computer program for programming a hardware device. In general, in operation 2052, a first net list is created with a first format based on a computer program. Further, in operation 2054, a second net list is created with a second format based on the computer program. In an aspect of the present invention, the first format may include EDIF. As another aspect, the second format may include VHDL, XNF, etc. It should be noted, however, that any other formats may be employed per the desires of the user.

[0515] It is important to note that the first net list and the second net list are created utilizing a single compiler. Note operation 2056. As an option, the computer program from which the first net list was created may be the same as the computer program from which the second net list was created. More information regarding the compiler will now be set forth.

[0516] The Handel-C compiler compiles and optimizes Handel-C source code into a file suitable for simulation or a net list file which can be placed and routed on a real FPGA. The compiler is normally invoked automatically when the user selects an option from the Build menu.

[0517] Once the compile has completed, an estimate of the number of NAND gates required to implement the design is displayed in the output window. The compiler uses the GNU preprocessor. Flags can be passed to the preprocessor using the Preprocessor tab of the Project>Settings dialog box. If one wishes to run the compiler from a command line, one may do so by using the command handelc. A complete list of the command line options is set forth hereinafter.

[0518] The Build Process

[0519] FIG. 21 illustrates a build interface 2100, in accordance with one embodiment of the present invention. A build happens when:

[0520] one clicks on the build button 2102.

[0521] one has uncompiled files and one selects one of the Start Debug commands in the Build menu.

[0522] one selects Build or Rebuild All from the Build menu

[0523] This should:

[0524] preprocess header files and compile dependent header files

[0525] compile any files that have been added, changed and saved since the last compilation and also compile any files dependent upon them.

[0526] compile all dependent projects.

[0527] link the compiled files together

[0528] calculate the number of gates used

[0529] build a symbol table

[0530] generate a simulatable file or a net list.

[0531] If one changes the configuration for a project, he or she may need to compile all the files. Select the Build>Rebuild All command to ensure that all the files are recompiled.

[0532] The results of the compilation and build are displayed in the Build window. Double-clicking an error takes one to the appropriate line in the source file.

[0533] Checking Code Depth and Speed

[0534] One can examine the depth and speed of the code by compiling using the -e option. This creates:

[0535] an html file for the project, project.html; an html file for each file in the project, file_c.html. These highlight areas of code according to how much area or delay may be required to implement them.

[0536] One can look at these files by opening them in any Internet browser.

project.html

[0537] The project.html file has links to all the file_c.html files that highlight the source code. It also links to the top five areas and top five delays in the project.

[0538] file_c.html

[0539] The html versions of the source files show two versions of the source code. The first is colored according to the area required to implement the code; the second according to the amount of delay. Cool colors (blues and greens) indicate a small area or delay; hot colors (red and yellow) show where there are large areas or delays. There are full color tables at the end of each section. The five largest delays and areas are underlined and tagged with the number of gates or logic levels needed. These estimates are only a guide since full place and route is needed to get exact logic area and timing information.

[0540] The Build Menu

[0541] FIG. 22 illustrates a table showing a build menu 2200, in accordance with one embodiment of the present invention.

[0542] Debugger and Simulator

[0543] FIG. 22A illustrates a method 2250 for debugging a computer program, in accordance with one embodiment of the present invention. In general, in operation 2252, a plurality of threads is identified in a computer program.

[0544] Selection of one of the threads is allowed in operation 2254. In another aspect, the thread may be selected by inserting a breakpoint in the computer program. As may soon become apparent, this or any other desired method may be used to carry out the selection. As such, the user can choose to jump between threads existing in the same clock cycle. Note use of the “follow” command hereinafter.

[0545] The selected thread is then debugged. See operation 2256. In one aspect of the present invention, a default thread may be initially debugged without user action (automatically). As an option, the default thread may be a thread that is first encountered in the computer program. In a further aspect, the debugging may utilize a clock associated with the selected thread.

[0546] The simulator thus allows one to test the program without using real hardware. It allows one to see the state of every variable (register) in the program at every clock cycle. One can select which variables are to be displayed by using the Watch and Variable windows. One can see the current threads running in the Threads window and the current clocks used in the Clocks window. A person can see the current function, and what functions were called to reach it, in the Call Stack window.

[0547] One can run the code in the simulator in several ways:

[0548] Run until the end (never ends on a continuous program loop)

[0549] Run until one reaches the current cursor position

[0550] Run until one reaches a user-defined breakpoint

[0551] Step through the code.

[0552] When one is using the debugger one can be running the simulation (run mode) or pausing the simulation (break mode). When the simulation has paused (in one of the ways given above or by using the Break command) one can easily examine variables, change window displays, or set breakpoints. When the simulation is in run mode, one can only observe.

[0553] When one starts the debugger, a Debug menu appears. FIGS. 23A and 23B illustrate the various commands 2200 associated with the debug menu, in accordance with one embodiment of the present invention.

[0554] One can also set breakpoints on valid code lines. When the debugger reaches a breakpoint it may pause until one requests it to continue.

[0555] The Debugger Interface

[0556] The debugger interface consists of a plurality of windows. FIG. 24 illustrates a table 2400 showing the various windows associated with the debugger interface, in accordance with one embodiment of the present invention.

[0557] Symbols in the Editor Window

[0558] The statements associated with the current clock tick are marked with arrows. All of these statements execute together. If there is a par statement in the code, the execution may split into separate threads, one for each branch of the par statement. The threads execute in parallel. When one is debugging, one can only follow one thread at a time. The current thread has arrows marked in yellow and white. White arrows show combinatorial code that may be executed on the next clock tick. A yellow arrow shows the current point.

[0559] The other threads have the points that may be executed on the current clock cycle in dark gray. If one single-steps through the Handel-C code, one may see the arrows move.

[0560] The Variables Window

[0561] FIG. 25 illustrates a variables window interface 2500, in accordance with one embodiment of the present invention. The Variables window always shows the current variables 2502. When their values change, the color changes from black to red. The window has two tabs 2504, Auto and Locals. The Auto tab shows variables that have been automatically selected. They are variables used in the current and previous statement in the current thread. It also displays return values when one comes out of or steps over a function.

[0562] The Locals tab shows the variables that are local to the current function or macro.

[0563] The Watch Windows

[0564] There are four watch windows. One can select variables to be displayed in each window, and look at their values at any breakpoint or as one steps through the program.

[0565] One can add a variable to the watch window by typing its name. The watch window has an expression evaluator. If one types in an expression, the result may be evaluated.

[0566] The Call Stack Window

[0567] FIG. 26 illustrates the current positioning function blib 2600, and the related call stack window. The functions called on the way to the current function are displayed in the Call Stack window. This shows the current function at the top of the window, and the functions that have not yet completed beneath.

[0568] The current function in the current thread is marked with a yellow arrow. If multiple threads are running different functions, the other current functions are marked with green arrows.

[0569] The Threads Window

[0570] FIG. 27 illustrates a threads window interface 2700, in accordance with one embodiment of the present invention. All threads 2702 are displayed in the Threads window.

[0571] The thread column shows the thread ID 2704 (how the simulator identifies the thread). The yellow arrow 2706 indicates the current thread. The grey arrows 2708 indicate threads with the same clock as the current thread. The Detail column gives an outline of the provenance of this thread. The picture shows four threads that are branches of the replicated par in queue.c. They are distinguished here by the par (i=XXX) detail. The Location column tells one the current line number of that thread in the code.

[0572] Right-click in the Threads window to see a menu:

[0573] Show Location shows one the source file and scrolls to the right position

[0574] Follow tells the debugger to follow that thread (make it the current thread)

[0575] The Clocks Window

[0576] FIG. 28 illustrates a clocks window interface 2800, in accordance with one embodiment of the present invention. All clocks 2802 used are displayed in the Clocks window. The current clock is marked with a yellow arrow. It is identified by the full pathname of the file referencing it. The clock cycle count 2804 is also displayed in the Clocks window. Double-clicking a clock takes one to the clock definition.

[0577] Using the Debugger Commands

[0578] One can use the debugger commands to go through every line of the code, step over functions and macros, or run the code until a breakpoint has been reached.

[0579] Single Stepping

[0580] The simulator steps through the program one clock cycle at a time. Essentially, assignments, and reads and writes to channels, take one clock cycle; everything else is ‘free’. In a sequential language, such as ISO-C, one can step through code one line at a time, and one stops at an execution point. Because Handel-C is a parallel language, there can be multiple execution points. Because parallel threads are implemented as separate pieces of logic, multiple statements may execute on the same clock tick.

[0581] Single stepping through the program does not mean stepping through it one line at a time, or one statement at a time.
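The timing rule described above can be illustrated with a small conventional C sketch. It is a simplified model only, not the simulator's implementation: the Stmt type and the cycles function are hypothetical names, each assignment or channel operation is assumed to cost one clock cycle, a sequential block is assumed to cost the sum of its parts, and a par block is assumed to cost as much as its slowest branch.

```c
#include <stddef.h>

/* Hypothetical statement tree used only for this illustration. */
typedef enum { STMT_ASSIGN, STMT_SEQ, STMT_PAR } StmtKind;

typedef struct Stmt {
    StmtKind kind;
    const struct Stmt *const *children; /* unused for STMT_ASSIGN */
    size_t count;                        /* number of children */
} Stmt;

/* Estimate clock cycles under the simplified model described above. */
unsigned cycles(const Stmt *s)
{
    unsigned total = 0;
    size_t i;

    switch (s->kind) {
    case STMT_ASSIGN:
        return 1;                        /* one assignment, one clock tick */
    case STMT_SEQ:
        for (i = 0; i < s->count; i++)
            total += cycles(s->children[i]);   /* branches run in turn */
        return total;
    case STMT_PAR:
        for (i = 0; i < s->count; i++) {
            unsigned c = cycles(s->children[i]);
            if (c > total)
                total = c;               /* branches share clock ticks */
        }
        return total;
    }
    return 0;
}
```

Under these assumptions, two assignments in sequence take two clock cycles, while the same two assignments placed in parallel take one, which is why a single step in the debugger may advance several statements at once.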

[0582] One can choose to Step Into, Step Out of or Step Over functions and macros. If one wants to move forward a single line, rather than a complete clock cycle, one can use the Advance command.

[0583] Using Breakpoints

[0584] FIG. 29 illustrates a breakpoints window interface 2900, in accordance with one embodiment of the present invention. If a person does not wish to single-step through the code, one can run until he or she reaches a breakpoint.

[0585] Setting Breakpoints

[0586] Select the line of code where one wishes the simulator to pause.(Use Edit>Find to hunt for known names.)

[0587] Click the breakpoint button OR

[0588] Select Break from the Debug menu. OR

[0589] Right-click the mouse and select Insert Breakpoint

[0590] Disabling Breakpoints

[0591] Breakpoints can be active or inactive. If one wishes to keep a breakpoint but not to stop at it,

[0592] Find the line of code where the breakpoint is set; right-click the mouse and select Disable Breakpoint

[0593] All breakpoints are listed in the Edit>Breakpoints dialog box. One can also disable a breakpoint by unchecking its box in this dialog.

[0594] Removing Breakpoints

[0595] Find the line of code where the breakpoint is set.

[0596] Click the breakpoint button OR

[0597] Right-click the mouse and select Remove Breakpoint OR

[0598] Open the breakpoints dialog (Edit>Breakpoints), select the

[0599] breakpoint(s) to be removed and click Remove.

[0600] Breakpoints in Replicated Code

[0601] If one sets a breakpoint in replicated code, a breakpoint may be set in every copy of the code. When one steps through it, the arrows may not appear to advance, but one can see the thread changing in the Threads window.

[0602] Breakpoints in Macros and Inline Functions

[0603] One cannot set breakpoints in macro expressions. If a person sets a breakpoint in an inline function or a macro procedure, the breakpoint may occur every time that the code is used.

[0604] Following Threads

[0605] The default thread followed is the one that appears first in the code. One can follow another thread by:

[0606] Selecting the code to follow in the code editor, right-clicking the mouse and selecting Follow Thread OR

[0607] Opening the Threads window, selecting a thread, right-clicking and selecting Follow Thread OR

[0608] Setting a breakpoint within that thread.

[0609] Setting a breakpoint in a thread makes that the current thread when the breakpoint is reached.

[0610] Selecting Clocks

[0611] The clock used is the one associated with the current thread. One can change the clock domain followed by:

[0612] following a different thread

[0613] setting a breakpoint within the thread to be followed

[0614] All clocks used are displayed in the Clocks window. The current clock is marked with a yellow arrow. It is identified by the full pathname of the file referencing it.

[0615] The current clock cycle count is also displayed in the Clocks window.

[0616] Following Function Calls

[0617] The way a function has been called is displayed in the Call Stack window. This shows the current function at the top of the window, and the uncompleted functions that called it beneath. The current function in the current thread is marked with a yellow arrow. If multiple threads are running different functions, the other current functions are marked with green arrows. If a function has stopped at a breakpoint, the breakpoint marker is shown in the Call Stack window.

[0618] Examining Variables

[0619] There are two windows for examining variable values:

[0620] Watch

[0621] Variables

[0622] By default variables are displayed in decimal. One can change the base by right-clicking within the window and selecting a new value from the pop-up menu. One can change the display base of an individual variable using the Handel-C specification with {base=n}. One can turn off the display of a variable by using the Handel-C specification with {show=0}.

[0623] int 32 pike with {show = 0};

[0624] Arrays and structures are displayed with a + button next to the name. Click on this button to display individual array elements or structure members.

[0625] Configuration

[0626] In debug mode, the project configuration for debug is set by default.

[0627] Debug Configuration

[0628] The settings specific to debug are:

[0629] Preprocessor defines the variables DEBUG and SIMULATE. This allows one to set up the code (see examples below) according to whether a person is using the simulator, e.g. use simulator channels instead of real interfaces.

[0630] Compiler Generate Debug and Generate warning boxes checked

[0631] Linker Output format set to Simulator; Save browse info box checked; Generate estimation information option (create html files) switched off.

[0632] Debugger Working directory for debugger set to current (.).

[0633] Optimizations High-level optimization switched on.
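As a hedged illustration of the Preprocessor setting above, the following conventional C sketch shows how code may be arranged so that one path is compiled when SIMULATE is defined and another when it is not. Only the macro name SIMULATE comes from the text; the function io_backend and the strings it returns are hypothetical and stand in for whatever simulator channels or real interfaces a given design uses.

```c
#include <string.h>

/*
 * Select the input/output path at compile time.
 * SIMULATE is defined by the debug configuration; io_backend is a
 * hypothetical helper used only for this illustration.
 */
#ifdef SIMULATE
const char *io_backend(void)
{
    return "simulator channels";   /* debug build: talk to the simulator */
}
#else
const char *io_backend(void)
{
    return "hardware interfaces";  /* other builds: talk to the real device */
}
#endif
```

Compiling the same source with and without the SIMULATE definition then yields the two variants without any change to the callers of io_backend.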

[0634] Hardware Embodiments

[0635] If one is approaching Handel-C from a hardware background, one should be aware of these points:

[0636] Handel-C is halfway between RTL and a behavioral HDL. It is a high-level language that requires one to think in algorithms rather than circuits.

[0637] Handel-C uses a zero-delay model and a synchronous design style.

[0638] Handel-C is implicitly sequential. Parallel processes may be specified.

[0639] All code in Handel-C (apart from the simulator chanin and chanout commands) can be synthesized, so one should ensure that debug code is disabled when compiling to target real hardware.

[0640] Signals in Handel-C are different from signals in VHDL; they are assigned to immediately, and only hold their value for one clock cycle.

[0641] Handel-C has abstract high-level concepts such as pointers.

[0642] Points of Difference

[0643] If one is an experienced C user, he or she may be caught unawares by some of the differences between C and Handel-C. The differences are summarized hereinafter.

[0644] FIGS. 30 and 31 illustrate a table showing various differences 3100 between Handel-C and the conventional C programming language, in accordance with one embodiment of the present invention.

[0645] Porting C to HANDEL-C

[0646] Introduction

[0647] This section illustrates the general process of porting an existing conventional C routine to Handel-C. The general issues are discussed first and then illustrated with the particular example of an edge detection routine. This example illustrates the whole conversion process from conventional C program to optimized Handel-C program and also shows how to map conventional C onto real hardware. There is also a section detailing the differences between conventional C and Handel-C.

[0648] General Porting Issues

[0649] In general, there are a number of stages to porting and mapping a conventional C program to hardware. These are:

[0650] 1. Decide on how the software system maps onto the target hardware platform. For example, external RAM connected to the FPGA can be used to hold buffers used in the conventional C program. This mapping may also include partitioning the algorithm between multiple FPGAs and, hence, splitting the conventional C into multiple Handel-C programs.

[0651] 2. Convert the conventional C program to Handel-C and use the simulator to check correctness. Remember that there may be optimizations that can be made to the algorithm given that a Handel-C program can use parallelism. For example, one can sort numbers more quickly in parallel by using a sorting network. This form of coarse grain parallelism can provide massive performance gains, so time should be spent on this step.

[0652] 3. Modify code to take advantage of extra operators available in Handel-C. For example, concatenation and bit selection can be used where conventional C programs may use shifts and masks. Simulate again to ensure the program is still correct.

[0653] 4. Add fine grain parallelism such as making parallel assignments or executing individual instructions in parallel to fine-tune performance. Again, simulate to ensure that the program still functions correctly.

[0654] 5. Add the hardware interfaces necessary for the target architecture and map the simulator channel communications onto these interfaces. If possible, simulate to ensure the mapping has been performed correctly.

[0655] 6. Use the FPGA place and route tools to generate the FPGA image(s).
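The sorting network mentioned in step 2 can be sketched in conventional C as follows. This is an illustration only and is not taken from the example programs: the names compare_exchange and sort4 are hypothetical, and the network shown is the standard five-comparator arrangement for four inputs. In C the comparators run one after another; the comments mark which of them are independent and could therefore operate in the same clock cycle once expressed in parallel.

```c
#include <stddef.h>

/* Compare-exchange: afterwards v[i] <= v[j]. */
static void compare_exchange(unsigned *v, size_t i, size_t j)
{
    if (v[i] > v[j]) {
        unsigned tmp = v[i];
        v[i] = v[j];
        v[j] = tmp;
    }
}

/* Sort exactly four values with a fixed five-comparator network. */
void sort4(unsigned v[4])
{
    /* Stage 1: these two comparators touch disjoint elements. */
    compare_exchange(v, 0, 1);
    compare_exchange(v, 2, 3);

    /* Stage 2: again independent of each other. */
    compare_exchange(v, 0, 2);
    compare_exchange(v, 1, 3);

    /* Stage 3: the final comparator settles the middle pair. */
    compare_exchange(v, 1, 2);
}
```

Because the sequence of comparisons is fixed and does not depend on the data, the three stages map naturally onto three groups of parallel operations rather than a data-dependent loop.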

[0656] These steps are obviously guidelines only; some of the stages may not be relevant to the design or one may require extra stages if the design does not fit this example flow. This list provides a starting point and guidelines for how to approach the process of porting the code. A full example follows after the section comparing C and Handel-C.
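Step 3 of the list above refers to the shifts and masks that conventional C programs use where Handel-C offers concatenation and bit selection. The sketch below shows the conventional C side of that comparison so that candidate code for rewriting can be recognized. The function names concat8 and select_bits are hypothetical, and 32-bit unsigned values are assumed.

```c
#include <stdint.h>

/* Join two 8-bit values into one 16-bit value: hi becomes bits 15..8. */
uint16_t concat8(uint8_t hi, uint8_t lo)
{
    return (uint16_t)(((uint32_t)hi << 8) | lo);
}

/* Extract bits msb..lsb (inclusive) of value, with msb >= lsb and msb < 32. */
uint32_t select_bits(uint32_t value, unsigned msb, unsigned lsb)
{
    unsigned width = msb - lsb + 1;
    /* Avoid shifting by the full word width, which C leaves undefined. */
    uint32_t mask = (width >= 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);

    return (value >> lsb) & mask;
}
```

For example, concat8(0xAB, 0xCD) gives 0xABCD, and select_bits(0xABCD, 11, 4) gives 0xBC. In hardware such operations amount to wiring rather than arithmetic, which is why expressing them directly is worthwhile.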

[0657] One of the most important factors in selecting a good partitioning of a program between hardware and software is to take into account the cost of communicating data between the two halves of the partition. The communication link between the hardware and software is determined by a number of parameters particular to a given target. These parameters include bandwidth, latency, and (per-message) overhead.
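One simple way to combine the three parameters named above into a cost estimate is sketched below in conventional C. The linear model is an assumption made for illustration, not a formula given in this description: each message is taken to pay the latency and the per-message overhead once, and the payload is taken to move at the link bandwidth. The type Link and the function transfer_time are hypothetical names.

```c
/* Hypothetical description of a hardware/software communication link. */
typedef struct {
    double bandwidth;  /* bytes per second */
    double latency;    /* seconds per message */
    double overhead;   /* seconds of per-message software overhead */
} Link;

/*
 * Estimated time to move `bytes` of payload split over `messages`
 * messages, under the assumed linear cost model.
 */
double transfer_time(const Link *link, unsigned long messages,
                     unsigned long bytes)
{
    return (double)messages * (link->latency + link->overhead)
         + (double)bytes / link->bandwidth;
}
```

Under this model many small transfers are dominated by latency and overhead, while a few large transfers are dominated by bandwidth, which is the trade-off a partitioning decision has to weigh.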

[0658] For some languages, it is possible to determine exactly the amount of data that would be transferred by an operation such as a function call, since all the data is passed in one direction by the arguments, and in the other direction by the return value. However, many other languages (including C) pass data implicitly using pointers. For these languages static analysis techniques cannot yield usefully accurate results. It is in this situation that the techniques presented are applicable.

[0659] One technique relies on dynamic analysis of the source program. The source program is compiled to platform independent bytecode. A suitable bytecode interpreter is augmented such that accesses to memory (typically load and store instructions) can be traced. In this way the memory use behavior of each part of the source program can be examined by executing the program and analyzing the generated trace. A simplistic implementation of this technique suffers from the problem of generating a very large amount of profiling data. The present embodiment uses two alternative techniques to solve this problem:

[0660] 1. During execution of a single function (or set of functions grouped as a domain) the present embodiment records a map of all the memory accessed. At the end of execution of the function it outputs only a compressed version of this map (compressed using a technique such as run-length encoding). Since functions typically tend to use blocks of memory in ranges, rather than a fully random access pattern, this results in significant savings in the size of the generated output. The output is then analyzed post-hoc to determine where memory transfers would have taken place between domains of a partitioned system.

[0661] 2. Alternatively, some of the analysis can happen on-line during the execution of the program. In this case, a memory map of the program is kept which records which functions (or groups of functions) have valid copies of small ranges of memory (micropages). When a function reads from an area of memory, this map is checked to see which functions have a valid copy of the data. If the current function has a valid copy, no further action is taken. If no function has a valid copy of the data, then it is taken as coming from an external source function. Otherwise a transfer from one of the other functions to the current function is recorded, and the map records that the current function now has a valid copy of the micropage. When a write occurs, exactly the same action takes place, except that ownership of the micropage passes to the current function alone; no other function then possesses a valid (up-to-date) copy of the data in the given page. The result of the execution of a program in this way is a 2-dimensional table recording data transfers from functions to functions. This data can then be further analyzed to give estimates for the performance of given partitions, be used to decide partitions, or be presented in a graphical form (such as a directed graph). It has been assumed in the above that the compiled code is executed within a virtual machine. It is possible via modification to the compiler to generate native code with appropriate traps on memory accesses and calls to functions implementing either of the above strategies. This results in an improvement in performance over the bytecode alternative.
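The second, on-line technique can be sketched in conventional C as follows. This is a minimal illustration under stated assumptions rather than the embodiment's implementation: the names MemMap, memmap_init, memmap_read and memmap_write are hypothetical, the limits of four functions and sixteen 16-byte micropages are arbitrary, and where several functions hold a valid copy the sketch simply credits the lowest-numbered one as the source.

```c
#include <stddef.h>
#include <string.h>

#define NUM_FUNCS  4            /* hypothetical number of traced functions */
#define NUM_PAGES  16           /* hypothetical number of micropages */
#define PAGE_SHIFT 4            /* hypothetical 16-byte micropages */
#define EXTERNAL   NUM_FUNCS    /* pseudo-source for data nobody owns yet */

typedef struct {
    unsigned char valid[NUM_PAGES];  /* bit f set: function f has a valid copy */
    unsigned transfers[NUM_FUNCS + 1][NUM_FUNCS];  /* [source][destination] */
} MemMap;

void memmap_init(MemMap *m)
{
    memset(m, 0, sizeof *m);
}

/* Record a read of `addr` by function `func`. */
void memmap_read(MemMap *m, unsigned func, size_t addr)
{
    size_t page = (addr >> PAGE_SHIFT) % NUM_PAGES;
    unsigned src = EXTERNAL;
    unsigned f;

    if (m->valid[page] & (1u << func))
        return;                      /* already valid: no further action */

    for (f = 0; f < NUM_FUNCS; f++) {
        if (m->valid[page] & (1u << f)) {
            src = f;                 /* first function holding a valid copy */
            break;
        }
    }
    m->transfers[src][func]++;       /* record source -> current function */
    m->valid[page] |= (unsigned char)(1u << func);
}

/* Record a write: same as a read, then the writer becomes the sole owner. */
void memmap_write(MemMap *m, unsigned func, size_t addr)
{
    size_t page = (addr >> PAGE_SHIFT) % NUM_PAGES;

    memmap_read(m, func, addr);
    m->valid[page] = (unsigned char)(1u << func);
}
```

After a traced run, the transfers array is the 2-dimensional table described above: entry [s][d] counts the micropage transfers from function s (or from the external source) to function d.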

[0662] Comparison Between Conventional C and Handel-C

[0663] This section details the types, operators, and statements available in conventional C and Handel-C. The tables should be used to get an idea of which parts of the conventional C program need to be altered. Differences in implementation between Handel-C and ISO-C:

[0664] Functions may not be recursive.

[0665] Old-style function declarations are not necessarily supported.

[0666] Variable length parameter lists are not necessarily supported.

[0667] One may not necessarily change the width of a variable by casting

[0668] One cannot convert pointer types except to and from void, between signed and unsigned and between similar structs

[0669] Floating point is not necessarily supported, but may be supported (optionally) in some embodiments

[0670] Expressions in Handel-C may not cause side-effects. This has the following consequences:

[0671] local initializations are not supported.

[0672] the initialization and iteration phases of for loops may be statements, not expressions.

[0673] shortcut assignments (e.g. +=) may appear as standalone statements.
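Several of the restrictions listed above can be anticipated while the code is still conventional C. The sketch below, offered as an illustration only, rewrites a recursive routine as a loop and keeps each assignment as a standalone statement with no initialization at the point of declaration. The greatest-common-divisor routine and the names gcd_recursive and gcd_iterative are hypothetical examples, not part of the edge detector.

```c
/* Recursive form: acceptable in ISO-C, but not portable as written,
 * since functions may not be recursive in Handel-C. */
unsigned gcd_recursive(unsigned a, unsigned b)
{
    return (b == 0) ? a : gcd_recursive(b, a % b);
}

/* Iterative form: the same result without recursion. */
unsigned gcd_iterative(unsigned a, unsigned b)
{
    unsigned t;          /* declared without a local initialization */

    while (b != 0) {
        t = a % b;       /* each assignment is a standalone statement */
        a = b;
        b = t;
    }
    return a;
}
```

Checking that both forms agree in conventional C before porting gives a reference against which the simulated version can later be compared.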

[0674] Types, Type Operators and Objects

[0675] FIG. 32 illustrates a table of types, type operators and objects 3200, in accordance with one embodiment of the present invention.

[0676] Statements

[0677] FIG. 33 illustrates a table of statements 3300, in accordance with one embodiment of the present invention.

[0678] Expressions

[0679] FIG. 34 illustrates a table of expressions 3400, in accordance with one embodiment of the present invention.

[0680] In Both/In Conventional C Only/In Handel-C Only

[0681] The edge detector example (C to Handel-C)

[0682] The edge detector consists of a number of versions of the same application that detail the process of porting a conventional C application to a Handel-C application. All but the final stage (targeting real hardware) are presented as complete examples that may be simulated with the Handel-C simulator. They are stored as separate projects within a single workspace.

[0683] The original C code is supplied in source and compiled versions. One can execute this code, and simulate the different versions of the ported code. Note that the examples use specific hard-coded filenames for the image data. The image data filenames may be exactly the same as those given in the examples, or the source code may be edited and recompiled.

[0684] The Original Program

[0685] The example used in this section to illustrate the porting process is that of a simple edge detector. Each of the stages outlined in the previous section is illustrated with complete code listings. The original conventional C program is given below.

    #include <stdio.h>
    #include <stdlib.h>

    /* Define name of input/output files */
    #define SourceFileName "../Data/source.raw"
    #define DestFileName "../Data/dest.raw"

    /* Define parameters of image and threshold for edges */
    #define WIDTH 256
    #define HEIGHT 256
    #define THRESHOLD 16

    /* Edge detector procedure */
    void edge_detect(unsigned char *Source, unsigned char *Dest)
    {
        int x, y;

        /* Loop round for every pixel */
        for (y = 1; y < HEIGHT; y++)
            for (x = 1; x < WIDTH; x++)
            {
                /* Determine whether there is an edge here */
                if (abs(Source[x + y*WIDTH] - Source[x-1 + y*WIDTH]) > THRESHOLD ||
                    abs(Source[x + y*WIDTH] - Source[x + (y-1)*WIDTH]) > THRESHOLD)
                    Dest[x + y*WIDTH] = 0xFF;
                else
                    Dest[x + y*WIDTH] = 0;
            }
    }

    /* Main program */
    int main(void)
    {
        unsigned char *Source = malloc(WIDTH*HEIGHT);
        unsigned char *Dest = malloc(WIDTH*HEIGHT);
        FILE *FilePtr;

        /* Read image from file */
        FilePtr = fopen(SourceFileName, "rb");
        fread(Source, sizeof(unsigned char), WIDTH*HEIGHT, FilePtr);
        fclose(FilePtr);

        /* Do edge detection */
        edge_detect(Source, Dest);

        /* Write results back to file */
        FilePtr = fopen(DestFileName, "wb");
        fwrite(Dest, sizeof(unsigned char), WIDTH*HEIGHT, FilePtr);
        fclose(FilePtr);

        return 0;
    }

[0686] The program reads data from a raw data file into a buffer. The function edge_detect then performs a simple edge detection and stores the results in a second buffer, which is written to a second file. The edge detection is performed by subtracting the pixel values for adjacent horizontal and vertical pixels, taking the absolute values and thresholding the result. The source and destination images are both 8 bit per pixel greyscale images. The conventional C source file and a compiled version are provided along with an example image (source.bmp). One can run the program now to see the results. This is done using the following commands:

[0687] 1. Convert the example BMP file to raw data with the bmp2rawutility.

[0688] bmp2raw -b source.bmp source.raw 8bppdest.rgb

[0689] 2. Execute the conventional C edge detector.

[0690] edge_c

[0691] 3. Convert the output from the edge detector back to a BMP fileusing the raw2bmp utility:

[0692] raw2bmp -b 256 dest.raw dest_c.bmp 8bppsrc.rgb

[0693] One can use the standard Windows 98 and NT Paint utility todisplay the source and destination BMP files to compare results.

[0694] First Attempt Handel-C Program

[0695] The first step is to port the conventional C to Handel-C with as few changes as possible to ensure that the resulting program works correctly. The file handling sections of the original program are modified to read data from a file and write data back to a file using the Handel-C simulator. The resulting program is given below.

[0696] The following points should be noted about the port:

[0697] 1. The Source and Dest buffers have been replaced with two RAMs.

[0698] 2. An abs( ) macro expression has been used to replace the standard C function provided in stdlib.h.

[0699] 3. The x and y variables have been given widths equal to the number of address lines required for the RAMs to simplify the index of the RAM. Without this, each variable would have to be padded with zeros in its MSBs to avoid a width conflict when accessing the RAM.

[0700] 4. Temporary variables have been used for the three pixels read from RAM to avoid the restriction on only one access per RAM per clock cycle. Without these variables, the condition for the if statement would require multiple accesses to the Source RAM.

[0701] 5. The pixel values may be extended by one bit to ensure the subtract does not underflow.

[0702] 6. The Input and Output channels are declared to read from and write to files for simulation. The file name is given using the with specification, e.g. chanin unsigned Input with {infile = "../Data/source.dat"};
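Point 5 in the list above can be illustrated in conventional C. The following is a minimal sketch, not part of the supplied examples; the function names diff_8bit, diff_widened and abs_diff are hypothetical. It contrasts a subtraction confined to 8 bits, which wraps around, with a subtraction performed after widening, whose result range of -255 to 255 fits in the 9-bit signed type used by the Handel-C listing.

```c
#include <stdint.h>

/* Subtraction kept at 8 bits: the result wraps modulo 256,
 * so 10 - 20 comes out as 246 instead of -10. */
uint8_t diff_8bit(uint8_t a, uint8_t b)
{
    return (uint8_t)(a - b);
}

/* Subtraction after zero-extending both pixels. The result lies in
 * -255..255, which is representable in 9 signed bits. */
int diff_widened(uint8_t a, uint8_t b)
{
    int wa = a;   /* zero-extended pixel */
    int wb = b;   /* zero-extended pixel */
    return wa - wb;
}

/* Absolute difference built on the widened subtraction. */
int abs_diff(uint8_t a, uint8_t b)
{
    int d = diff_widened(a, b);
    return d < 0 ? -d : d;
}
```

With the widened form, abs_diff(10, 20) and abs_diff(20, 10) both give 10, which is the quantity the threshold comparison needs.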

[0703] To execute the Handel-C code:

[0704] 1. Convert the example BMP file to text data with the bmp2rawutility by typing:

[0705] bmp2raw source.bmp source.dat 8bppdest.rgb

[0706] 2. Open the Handel-C edge detector workspace(Examples/Handel-C/Examples/ExampleC/ExampleC.hw) by double-clicking onit. Build and run the project.

[0707] 3. Convert the output from the edge detector back to a BMP fileusing the raw2bmp utility by typing:

[0708] raw2bmp 256 dest.dat dest_v1.bmp 8bppsrc.rgb

Example code version 1

/*
 * Description
 *
 * Handel-C edge detector example program - First pass.
 *
 * To test open the workspace file 'ExampleC.hw'.
 */

/* Define a clock */
set clock = external "P1";

/* Define parameters of image and threshold for edges */
#define LOG2_WIDTH 8
#define WIDTH 256
#define LOG2_HEIGHT 8
#define HEIGHT 256
#define THRESHOLD 16

/* Declare RAMs for source and destination images */
ram unsigned char Source[WIDTH*HEIGHT];
ram unsigned char Dest[WIDTH*HEIGHT];

/* Declare a macro for absolute value */
macro expr abs(a) = (a<0 ? -a : a);

/* Edge detector procedure */
void edge_detect( )
{
    unsigned (LOG2_WIDTH+LOG2_HEIGHT) x;
    unsigned (LOG2_WIDTH+LOG2_HEIGHT) y;
    int 9 Pixel1, Pixel2, Pixel3;

    /* Loop round for every pixel */
    for (y=1; y<HEIGHT; y++)
    {
        for (x=1; x<WIDTH; x++)
        {
            Pixel1 = (int)(0 @ Source[x + y*WIDTH]);
            Pixel2 = (int)(0 @ Source[x-1 + y*WIDTH]);
            Pixel3 = (int)(0 @ Source[x + (y-1)*WIDTH]);

            /* Determine whether there is an edge here */
            if (abs(Pixel1 - Pixel2) > THRESHOLD ||
                abs(Pixel1 - Pixel3) > THRESHOLD)
            {
                Dest[x + y*WIDTH] = 0xFF;
            }
            else
            {
                Dest[x + y*WIDTH] = 0;
            }
        }
    }
}

/* Main program */
void main(void)
{
    chanin unsigned Input with {infile = "../Data/source.dat"};
    chanout unsigned Output with {outfile = "../Data/dest.dat"};
    unsigned (LOG2_WIDTH+LOG2_HEIGHT) i;
    unsigned (LOG2_WIDTH+LOG2_HEIGHT) j;

    /* Read image from file */
    for (i=0; i<HEIGHT; i++)
        for (j=0; j<WIDTH; j++)
            Input ? Source[j + i*WIDTH];

    /* Do edge detection */
    edge_detect( );

    /* Write results back to file */
    for (i=0; i<HEIGHT; i++)
        for (j=0; j<WIDTH; j++)
            Output ! Dest[j + i*WIDTH];
    delay;
}

[0709] First Optimizations of the Handel-C Program

[0710] The next development stage is to change some of the operators familiar in C to operators more suitable for Handel-C. In the above example, every time the Source or Dest RAM is accessed, a multiplication is made by the constant WIDTH. The Handel-C optimizer simplifies this to a shift left by 8 bits but one could easily do this by hand to reflect the hardware more accurately and reduce compilation times. New macros may also be introduced to access the RAMs given x and y co-ordinates:

macro expr ReadRAM(a, b) = ((unsigned 1)0) @ Source[(0@a) + ((0@b) << 8)];
macro proc WriteRAM(a, b, c) Dest[(0@a) + ((0@b) << 8)] = c;
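The equivalence between multiplying by WIDTH and shifting left by 8 bits can be checked in conventional C. This is a small sketch under the assumption of a 256-pixel-wide image, as in the examples; the function names addr_mul and addr_shift are hypothetical and do not appear in the supplied code.

```c
#include <stdint.h>

#define LOG2_WIDTH 8
#define WIDTH (1u << LOG2_WIDTH)

/* Address computed as in the first Handel-C version: x + y*WIDTH. */
uint16_t addr_mul(uint8_t x, uint8_t y)
{
    return (uint16_t)(x + y * WIDTH);
}

/* Address computed with an explicit shift: both co-ordinates are
 * widened to 16 bits first, mirroring the zero padding in the macros. */
uint16_t addr_shift(uint8_t x, uint8_t y)
{
    return (uint16_t)((uint16_t)x + ((uint16_t)y << LOG2_WIDTH));
}
```

For the co-ordinates (34, 56) both forms give address 14370, and for (255, 255) both give 65535, the last location of a 65536-entry RAM.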

[0711] Notice how the macros pad both the result and the co-ordinate expressions with zeros. This allows one to reduce the width of the x and y counters to 8 bits each and reduces clutter in the rest of the program. This width reduction does mean that the loop conditions need to be altered because x and y are no longer wide enough to hold the constant 256. Instead, one can test against zero since the counters wrap round to zero after 255.
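The wrap-round behavior of an 8-bit counter can be reproduced in conventional C with an unsigned 8-bit type. The sketch below is illustrative only and the function name count_wrapping_loop is hypothetical. It shows that a loop starting at 1 and testing against zero executes its body 255 times, once for each value from 1 to 255.

```c
#include <stdint.h>

/* Counts the iterations of a loop whose 8-bit counter starts at 1
 * and terminates when the counter wraps round to 0 after 255. */
unsigned count_wrapping_loop(void)
{
    unsigned iterations = 0;
    uint8_t x;

    for (x = 1; x != 0; x++)
        iterations++;

    return iterations;
}
```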

[0712] The modified edge_detect function is shown below:

Example code version 2

void edge_detect( )
{
    unsigned LOG2_WIDTH x;
    unsigned LOG2_HEIGHT y;
    int 9 Pixel1, Pixel2, Pixel3;

    /* Loop round for every pixel */
    for (y=1; y!=0; y++)
    {
        for (x=1; x!=0; x++)
        {
            Pixel1 = (int)ReadRAM(x, y);
            Pixel2 = (int)ReadRAM(x-1, y);
            Pixel3 = (int)ReadRAM(x, y-1);

            /* Determine whether there is an edge here */
            if (abs(Pixel1 - Pixel2) > THRESHOLD ||
                abs(Pixel1 - Pixel3) > THRESHOLD)
                WriteRAM(x, y, 0xFF);
            else
                WriteRAM(x, y, 0);
        }
    }
}

[0713] To execute this version of the Handel-C code:

[0714] 1. Make the version 2 project current within the ExampleCworkspace by selecting Project>Set Active Project>Edge_v2:

[0715] 2. Build and run the project by selecting Build>Build Edge_v2followed by F5.

[0716] 3. Convert the output from the edge detector back to a BMP fileusing the raw2bmp utility by opening a Command Prompt or MS-DOS window.Change to the Version 2 project directory and type: raw2bmp 256 dest.datdest_v2.bmp 8bppsrc.rgb

[0717] Adding Fine Grain Parallelism

[0718] There are two areas in this program that can be modified to improve performance. The first is to replace for loops with while loops and the second solves the problem of multiple accesses to external RAM in single clock cycles.

[0719] The for loop expands into a while loop inside the compiler in the following way:

for (Init; Test; Inc) Body;

becomes:

{
    Init;
    while (Test)
    {
        Body;
        Inc;
    }
}

[0720] This is normally not efficient for hardware implementation because the Inc statement is executed sequentially after the loop body when in most cases it could be executed in parallel. The solution is to expand the for loops by hand and use the par statement to execute the increment in parallel with one of the statements in the loop body.

[0721] The second optimization concerns the three statements required to read the three pixels from external RAM. Without the restriction on multiple accesses to RAMs the loop body of the edge detector could be executed in a single cycle whereas our current program requires four cycles, three of which access the RAM. What is needed is a modification to eliminate as many of these RAM accesses as possible.

[0722] Since it is not possible to access the external RAM more than once in one clock cycle, the only way to improve this program is to access multiple RAMs in parallel. It should also be clear that the current program accesses most locations in the external RAM three times. For example, when x is 34 and y is 56 the three pixels read are at co-ordinates (34,55), (33,56) and (34,56).

[0723] The first of these is also read when x is 34 and y is 55 and when x is 35 and y is 55, whereas the second is also read when x is 33 and y is 56 and when x is 33 and y is 57. If one can devise a scheme whereby pixels are stored in two extra RAMs when they are read from the main external RAM for the first time, then one could simply access these additional RAMs to get pixel values in the main loop.

[0724] The first step is to store the previous line of the image in an internal RAM on the FPGA. This allows the pixel above the current location to be read at the same time as the external RAM is accessed. The second step is to store the pixel to the left of the current location in a register. The loop body then looks something like this:

Pixel1 = ReadRAM(x, y);
Pixel2 = PixelLeft;
Pixel3 = LineAbove[x];
LineAbove[x] = Pixel1;
PixelLeft = Pixel1;

[0725] At first glance, it looks like things have been made worse by increasing the number of clock cycles, but one can now add parallelism to make it look like this:

par
{
    Pixel1 = (int)ReadRAM(x, y);
    Pixel2 = PixelLeft;
    Pixel3 = (int)LineAbove[x];
}
par
{
    LineAbove[x] = Pixel1;
    PixelLeft = Pixel1;
}

[0726] Note the LineAbove RAM may be initialized at the start of the image to contain the first line of the image and the PixelLeft variable may be initialized at the start of each line with the left hand pixel on that line. Since the second of these par statements and the if statement are not dependent on each other they can be executed in parallel. Putting all these modifications together gives an edge_detect procedure shown below.
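The line buffer and left-pixel register scheme can be modeled sequentially in conventional C to check that it produces the same result as the original three-read loop. The sketch below is a software model only and does not express any parallelism; the names edge_reference, edge_buffered and schemes_agree, the reduced 16 by 16 image size and the pseudo-random test image are all assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

#define W 16          /* reduced image width, for illustration only */
#define H 16          /* reduced image height, for illustration only */
#define THRESHOLD 16

/* Original scheme: three reads of the source image per pixel. */
void edge_reference(const unsigned char *src, unsigned char *dst)
{
    for (int y = 1; y < H; y++)
        for (int x = 1; x < W; x++) {
            int p = src[x + y * W];
            int edge = abs(p - src[x - 1 + y * W]) > THRESHOLD ||
                       abs(p - src[x + (y - 1) * W]) > THRESHOLD;
            dst[x + y * W] = edge ? 0xFF : 0;
        }
}

/* Buffered scheme: one source read per inner-loop iteration; the pixel
 * above comes from line_above and the pixel to the left from pixel_left. */
void edge_buffered(const unsigned char *src, unsigned char *dst)
{
    unsigned char line_above[W];

    memcpy(line_above, src, W);              /* first line of the image */
    for (int y = 1; y < H; y++) {
        int pixel_left = src[y * W];         /* left hand pixel of this line */
        for (int x = 1; x < W; x++) {
            int p = src[x + y * W];
            int edge = abs(p - pixel_left) > THRESHOLD ||
                       abs(p - line_above[x]) > THRESHOLD;
            dst[x + y * W] = edge ? 0xFF : 0;
            line_above[x] = (unsigned char)p; /* becomes "above" for next line */
            pixel_left = p;                   /* becomes "left" for next column */
        }
    }
}

/* Runs both schemes on a pseudo-random image; returns 1 if outputs match. */
int schemes_agree(void)
{
    unsigned char src[W * H], a[W * H], b[W * H];
    unsigned seed = 12345u;

    for (int i = 0; i < W * H; i++) {
        seed = seed * 1103515245u + 12345u;
        src[i] = (unsigned char)(seed >> 16);
    }
    memset(a, 0, sizeof a);
    memset(b, 0, sizeof b);
    edge_reference(src, a);
    edge_buffered(src, b);
    return memcmp(a, b, sizeof a) == 0;
}
```

The model updates line_above[x] only after it has been compared, which corresponds to reading the RAM in one clock cycle and writing it in the next.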

[0727] Notice that the increment of y has been moved from the end of the loop to the start and the start and end values have been adjusted accordingly. This allows the increment to be executed without additional clock cycles which would have been required if it were placed at the end of the loop.

[0728] To execute this version of the Handel-C code:

[0729] 1. Make the version 3 project current within the ExampleCworkspace by selecting Project>Set Active Project>Edge_v3;

[0730] 2. Build and run the project by selecting Build>Build Edge_v3followed by F5.

[0731] 3. Convert the output from the edge detector back to a BMP fileusing the raw2bmp utility by opening a Command Prompt or MS-DOS window.

[0732] Change to the Version 3 project directory and type: raw2bmp 256 dest.dat dest_v3.bmp 8bppsrc.rgb

Example code version 3

void edge_detect( )
{
    unsigned LOG2_WIDTH x;
    unsigned LOG2_HEIGHT y;
    int 9 Pixel1, Pixel2, Pixel3, PixelLeft;
    ram LineAbove[ ];

    /* Initialise the LineAbove RAM */
    x = 1;
    while (x!=0)
    {
        par
        {
            LineAbove[x] = ReadRAM(x, (unsigned LOG2_HEIGHT)0);
            x++;
        }
    }

    /* Loop for every line */
    y = 0;
    while (y!=255)
    {
        /* Initialise the PixelLeft register */
        par
        {
            x = 1;
            PixelLeft = (int)ReadRAM((unsigned LOG2_WIDTH)0, y+1);
            y++;
        }

        /* Loop for every column */
        while (x != 0)
        {
            /* Update pixel registers */
            par
            {
                Pixel1 = (int)ReadRAM(x, y);
                Pixel2 = PixelLeft;
                Pixel3 = (int)LineAbove[x];
            }

            /* Determine whether there is an edge here */
            par
            {
                LineAbove[x] = (unsigned)Pixel1;
                PixelLeft = Pixel1;
                if (abs(Pixel1 - Pixel2) > THRESHOLD ||
                    abs(Pixel1 - Pixel3) > THRESHOLD)
                    WriteRAM(x, y, 0xFF);
                else
                    WriteRAM(x, y, 0);
                x++;
            }
        }
    }
}

[0733] Further Fine Grain Parallelism

[0734] The core loop body has now been reduced from five clock cycles (including the loop increment) to two clock cycles. One can do even better because one should be able to access the two off-chip banks of RAM in parallel. Thus, the two parallel statements in the loop body could be executed simultaneously if one could organize the data flow correctly.

[0735] The program needs to be modified because the LineAbove internal RAM is accessed in both clock cycles. Executing the two statements in parallel is not permitted because it would involve two accesses to the same internal RAM in a single clock cycle. The solution is to increase the number of internal RAMs. The current line can be copied into one internal RAM while the previous line is read from a second internal RAM. The two internal RAM banks can then be swapped for the next line.

[0736] By also removing the Pixel1, Pixel2 and Pixel3 intermediate variables, the two statements in the loop body can be rolled into one. A person may use the LSB of the y coordinate to determine which line buffer to read from and which line buffer to write to. The external RAM read is done using a shared expression (RAMPixel) since one needs the value from the RAM in multiple places but only wants to perform the actual read once.

[0737] The new version of the edge detector is shown below. The core loop is now only one clock cycle long and is executed 255 times per line. One extra clock cycle is required per line for the initialization of variables and 255 lines are processed. In addition, 255 cycles are required to initialize the on-chip RAM and one extra clock cycle per frame is required for variable initialization. This gives a grand total of 65536 clock cycles per frame or an average of exactly one pixel per clock cycle. Since there is no way of getting the image into or the results out from the FPGA any faster than this one can conclude that the fastest possible solution to our problem has been reached.

Example code version 4

void edge_detect( )
{
    unsigned LOG2_WIDTH x;
    unsigned LOG2_HEIGHT y;
    int 9 PixelLeft;
    ram unsigned char LineAbove0[ ], LineAbove1[ ];
    unsigned 5 i;

    /* Initialise the x and y counters and the LineAbove RAM */
    par
    {
        x = 1;
        y = 0;
    }
    while (x!=0)
    {
        par
        {
            LineAbove0[x] = ReadRAM(x, (unsigned LOG2_HEIGHT)0) <- 8;
            x++;
        }
    }

    /* Loop for every line */
    while (y!=255)
    {
        /* Initialise the PixelLeft register */
        par
        {
            x = 1;
            PixelLeft = (int)ReadRAM((unsigned LOG2_WIDTH)0, y+1);
            y++;
        }

        /* Loop for every column */
        while (x != 0)
        {
            par
            {
                shared expr RAMPixel = (int)ReadRAM(x, y);
                shared expr PixelAbove = (int)(y[0]==0 ? 0@LineAbove0[x] : 0@LineAbove1[x]);
                macro expr abs(a) = (a<0 ? -a : a);

                /* Update pixel registers */
                if (y[0]==1)
                    LineAbove0[x] = (unsigned)(RAMPixel <- 8);
                else
                    LineAbove1[x] = (unsigned)(RAMPixel <- 8);
                PixelLeft = RAMPixel;

                /* Determine whether there is an edge here */
                if (abs(RAMPixel-PixelLeft) > THRESHOLD ||
                    abs(RAMPixel-PixelAbove) > THRESHOLD)
                    WriteRAM(x, y, 0xFF);
                else
                    WriteRAM(x, y, 0);
                x++;
            }
        }
    }
}
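The cycle count stated for the fourth version can be reproduced with a few lines of conventional C. The sketch below simply adds up the contributions listed in the text; the function name cycles_per_frame and its generalization to other image sizes are assumptions made for illustration.

```c
/* Clock cycles per frame for the one-cycle core loop, following the
 * breakdown in the text for a width x height image. */
unsigned long cycles_per_frame(unsigned width, unsigned height)
{
    unsigned long core_per_line = width - 1;        /* x runs from 1 to width-1 */
    unsigned long per_line = core_per_line + 1;     /* plus one setup cycle per line */
    unsigned long lines = height - 1;               /* y runs from 1 to height-1 */
    unsigned long ram_init = width - 1;             /* line buffer initialization */
    unsigned long frame_init = 1;                   /* x and y counter setup */

    return per_line * lines + ram_init + frame_init;
}
```

For a 256 by 256 image this gives 256 * 255 + 255 + 1 = 65536 cycles, which equals the number of pixels in the frame.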

[0738] To execute this version of the Handel-C code:

[0739] 1. Make the version 4 project current within the ExampleCworkspace by selecting Project>Set Active Project>Edge_v4:

[0740] 2. Build and run the project by selecting Build>Build Edge_v4followed by F5

[0741] 3. Convert the output from the edge detector back to a BMP fileusing the raw2bmp utility by opening a Command Prompt or MS-DOS window.Change to the Version 4 project directory and type: raw2bmp 256 dest.datdest_v4.bmp 8bppsrc.rgb

[0742] Adding the Hardware Interfaces

[0743] Once the program has been simulated correctly one may add the necessary hardware interfaces. The interface with the host requires the same signals and timings as the example set out hereinafter. The code will now be taken from that example and used to produce two macro procedures: one to read a word from the host and one to write a word to the host. (These could also be implemented as functions.) The suitably modified code looks like this:

// Read word from host
macro proc ReadWord(Reg)
{
    while (ReadReady == 0);
    Read = 1;               // Set the read strobe
    par
    {
        Reg = dataB.in;     // Read the bus
        Read = 0;           // Clear the read strobe
    }
}

// Write one word back to host
macro proc WriteWord(Expr)
{
    par
    {
        while (WriteReady == 0);
        dataBOut = Expr;
    }
    par
    {
        En = 1;             // Drive the bus
        Write = 1;          // Set the write strobe
    }
    Write = 0;              // Clear the write strobe
    En = 0;                 // Stop driving the bus
}

[0744] One also needs to define the pins for the external RAMs and remove the RAM declarations added to simulate the RAMs. The main program also needs to be modified to include the code to synchronize the frame grabber with the edge detector. The project settings need to be changed in the GUI. Set the configuration to VHDL or EDIF. This code is not designed for a specific device. One would need to know the appropriate pins for the device one is targeting. The pin definitions given are examples only and do not reflect the actual pins available on any particular device. The code excluding the edge detection and host interface macros is given below.

#define LOG2_WIDTH 8
#define WIDTH 256
#define LOG2_HEIGHT 8
#define HEIGHT 256

set clock = external "P1";

unsigned 8 Threshold;

// External RAM definitions/declarations
ram unsigned 8 Source[65536] with {
    offchip = 1,
    data = {"P1", "P2", "P3", "P4", "P5", "P6", "P7", "P8"},
    addr = {"P9", "P10", "P11", "P12", "P13", "P14", "P15", "P16",
            "P17", "P18", "P19", "P20", "P21", "P22", "P23", "P24"},
    we = {"P25"}, oe = {"P26"}, cs = {"P27"}};
ram unsigned 8 Dest[65536] with {
    offchip = 1,
    data = {"P28", "P29", "P30", "P31", "P32", "P33", "P34", "P35"},
    addr = {"P36", "P37", "P38", "P39", "P40", "P41", "P42", "P43",
            "P44", "P45", "P46", "P47", "P48", "P49", "P50", "P51"},
    we = {"P52"}, oe = {"P53"}, cs = {"P54"}};

macro expr ReadRAM(a, b) = ((unsigned 1)0) @ Source[(0@a) + ((0@b) << 8)];
macro proc WriteRAM(a, b, c) Dest[(0@a) + ((0@b) << 8)] = c;

#ifndef SIMULATE
// Host bus definitions/declarations
unsigned 8 dataBOut;
int 1 En = 0;
interface bus_ts_clock_in(int 4) dataB(dataBOut, En==1) with
    {data = {"P55", "P56", "P57", "P58"}};
int 1 Write = 0;
interface bus_out( ) writeB(Write) with {data = {"P59"}};
int 1 Read = 0;
interface bus_out( ) readB(Read) with {data = {"P60"}};
interface bus_clock_in(int 1) WriteReady( ) with {data = {"P61"}};
interface bus_clock_in(int 1) ReadReady( ) with {data = {"P62"}};
#endif

/* Insert edge_detect, ReadWord and WriteWord function and macro definitions here */

void main(void)
{
    ReadWord(Threshold);
    while (1)
    {
        unsigned Dummy;
        ReadWord(Dummy);
        edge_detect( );
        WriteWord(Dummy);
    }
}

[0745] Summary

[0746] The aim of this section has been to show the development of a real Handel-C program from conventional C to a full program targeted at hardware. It has also shown the performance benefits of the Handel-C approach by demonstrating a real time application executing with a great deal of parallelism.

[0747] Targeting Hardware

[0748] Targeting Hardware via VHDL

[0749] If one is integrating Handel-C code with raw VHDL code, one would compile the Handel-C for debug, and use ModelSim to compile the VHDL for simulation. One could then compile the Handel-C to VHDL and use Synplify, LeonardoSpectrum or FPGA Express to synthesize the code. One would then use Xilinx or Altera tools to place and route it.

[0750] Linking to the Handel-C VHDL Library

[0751] The HandelC.vhdl file may be supplied which supports all Handel-C VHDL files. To use Handel-C VHDL, one may compile the HandelC.vhdl file into a library called HandelC. (Consult the documentation for the synthesis or simulation tool on compiling library files.) A person also needs to compile the supplied file ROC.vhdl into the work library for simulation.

[0752] Connecting Handel-C EDIF to VHDL

[0753] If one compiles a Handel-C file to EDIF and wishes to connect it to VHDL, one should be aware that the ports in EDIF and VHDL are different. EDIF ports consist of a collection of single wires. VHDL ports are normally described as n-bit wide cables. To ensure that the generated EDIF can connect to the VHDL, the VHDL ports may be listed as single-bit wires.

[0754] VHDL Component within Handel-C Project

[0755] Handel-C Code

set clock = external "D17";
unsigned 4 x;
interface vhdl_component(unsigned 4 return_val)
    vhdl_component_instance(unsigned 1 clk = _clock,
                            unsigned 4 sent_value = x);
etc . . .
unsigned 4 y;
y = vhdl_component_instance;    // Read from VHDL component
x = y;                          // Write to VHDL component

[0756] VHDL Code

[0757] The VHDL entity may need an interface like this to be compatible with the Handel-C.

entity vhdl_component is
    port (
        clk : in std_logic;
        sent_value_0 : in std_logic;
        sent_value_1 : in std_logic;
        sent_value_2 : in std_logic;
        sent_value_3 : in std_logic;
        return_val_0 : out std_logic;
        return_val_1 : out std_logic;
        return_val_2 : out std_logic;
        return_val_3 : out std_logic
    );
end;

[0758] Note that all the ports are 1-bit wide std_logic types. This is because when the Handel-C is compiled to EDIF, this is how the expanded interface appears. (EDIF cannot represent n-bit wide cables, only single wires.)
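The expansion of an n-bit port into single wires can be illustrated by generating the wire names used in the VHDL entity above. The sketch below is in conventional C and assumes the naming pattern seen in that example (port name, underscore, bit index); the function names edif_wire_name and wire_name_equals are hypothetical and are not part of any Handel-C tool.

```c
#include <stdio.h>
#include <string.h>

/* Writes the name of one wire of an expanded port, e.g. "sent_value_3"
 * for port "sent_value" and bit 3. Returns the snprintf result. */
int edif_wire_name(char *out, size_t out_size, const char *port, unsigned bit)
{
    return snprintf(out, out_size, "%s_%u", port, bit);
}

/* Convenience check: returns 1 if the generated name equals `expected`. */
int wire_name_equals(const char *port, unsigned bit, const char *expected)
{
    char buf[64];

    edif_wire_name(buf, sizeof buf, port, bit);
    return strcmp(buf, expected) == 0;
}
```

Calling edif_wire_name for bits 0 to 3 of sent_value yields the four input port names listed in the entity declaration.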

[0759] Handel-C Component within VHDL Project

[0760] The Handel-C needs to have ports to its top level, so that the VHDL can connect to them.

unsigned 4 x;
interface port_in(unsigned 1 clk) ClockPort( );
interface port_in(unsigned 4 sent_value) InPort( );
interface port_out( ) OutPort(unsigned 4 return_value = x);
set clock = internal ClockPort.clk;
etc . . .
unsigned 4 y;
y = InPort.sent_value;    // Read from top-level VHDL
x = y;                    // Write to top-level VHDL

VHDL Code

The top level VHDL may need to instantiate the Handel-C like this:

component handelc_component
    port (
        clk : out std_logic;
        sent_value_0 : out std_logic;
        sent_value_1 : out std_logic;
        sent_value_2 : out std_logic;
        sent_value_3 : out std_logic;
        return_val_0 : in std_logic;
        return_val_1 : in std_logic;
        return_val_2 : in std_logic;
        return_val_3 : in std_logic
    );
end component;

[0761] Targeting Hardware via EDIF

[0762] To target hardware via EDIF, one may set up the project to targetEDIF using the Build>Set Active Configuration command. This compilesdirectly to an .edf file which can be passed to the place and routetools.

[0763] Port Renaming for Debug

[0764] To aid in debugging the generated EDIF, one can rename the EDIFnets within the net list such that the Handel-C declaration name appearsbefore the EDIF unique identifier.

[0765] To do so, select the Project>Settings . . . command. In theProject Settings dialog that opens, ensure that the EDIF is the type ofsettings that is being edited.

[0766] In the Compiler tab, check the Generate debug information box.

[0767] Setting Up Place and Route Tools

[0768]FIG. 35 illustrates a net list reader settings display 3500, inaccordance with one embodiment of the present invention. The Altera EDIFcompiler requires a library mapping file. This is supplied ashandelc.lmf.

[0769] Setting up MaxPlus II to Use handelc.lmf

[0770] Start MaxPlus II

[0771] Open MaxPlus II>Compiler

[0772] With the compiler selected, select Interfaces>EDIF Net listReader Settings.

[0773] In the dialog box, specify Vendor as Custom.

[0774] Click the Customize>>button (3502)

[0775] Select the LMF #1 radio button (3504). Set up the pathname (3506)for the handelc.lmf file.

[0776] (Installed in Handel-C installation root\lmf.)

[0777] Setting Up Quartus 2000 to Use handelc.lmf

[0778] FIGS. 36 and 37 illustrate tool settings displays 3600 and 3700, in accordance with one embodiment of the present invention.

[0779] Start Quartus.

[0780] Select the Project>EDA Tool Settings menu command.

[0781] In the dialog box, use the pull-down list to set Custom as theDesign entry/synthesis tool.

[0782] Click Settings. (3602) (Note FIG. 37.)

[0783] Set the File name 3702 for the Library Mapping File, click the .. . button to browse for handelc.lmf. (Installed in Handel-Cinstallation root\lmf.)

[0784] Setting Up Wire Names

[0785] One can specify the format of floating wire names in EDIF using the Handel-C bus format specification. This allows one to use the formats B1, B_1, B[1] and B(1)

[0786] where B represents the bus name, and 1 the wire number.

interface port_in(int 4 signals_to_HC with {busformat="B[1]"}) read( );

[0787]FIG. 38 illustrates the wires 3800 that would be produced whenspecifying floating wire names, in accordance with one embodiment of thepresent invention.

[0788] Connecting to VHDL Blocks

[0789] Requirements

[0790] If one wishes to connect Handel-C code to VHDL blocks andsimulate the results, one may require the following objects:

[0791] A VHDL simulator (currently ModelSim)

[0792] The cosimulator plugin (e.g. PlugInModelSim.dll) to allow theVHDL simulator to work in parallel with the Handel-C simulator. Thisfile is provided with the copy of Handel-C

[0793] The file plugin.vhdl to connect the VHDL to the cosimulatorplugin. This file is included with the copy of Handel-C

[0794] A VHDL wrapper file to connect the VHDL entity ports to theHandel-C simulator and to VHDL dummy signals. (One may write this)

[0795] The VHDL entity and architecture files (one may provide or writethese)

[0796] A Handel-C code file that includes an interface definition in theHandel-C code to connect it to the VHDL code. (One may write this.)

[0797] Simulation Requirements

[0798] Before one can simulate the code, one needs to:

[0799] 1. Set up ModelSim so that the work library refers to the librarycontaining this wrapper component.

[0800] 2. Check that the plugin has been installed in the same place asthe other Handel-C components. If one has moved it, he or she may ensurethat its new location is on the PATH.

[0801] 3. Compile the VHDL model to be integrated with Handel-C into the VHDL simulator.

4. Compile plugin.vhdl.

[0802] 5. Compile the wrapper.

[0803] 6. Compile the Handel-C code and run the Handel-C simulator. Thismay invoke any VHDL simulations required.

[0804] Batch Files

[0805] Sample batch files that carry out these tasks have been suppliedwith the examples:

[0806] handelc_vhdl.bat Sets up environment variables for ModelSim. Runonce before first co-simulating

[0807] reg32xlk_vhdl.bat Compiles all the components for the registerexample. Run once before co-simulating the example. Run again if theVHDL code is changed

[0808] ttl7446_vhdl.bat Compiles all components for the combinatoriallogic example. Run before co-simulating and if the VHDL code is changed.

[0809]FIG. 39 illustrates an interface 3900 in the form of a plug-in3902 between Handel-C 3904 and VHDL 3906 for simulation, in accordancewith one embodiment of the present invention.

[0810] Place and Route Requirements

[0811] If one wishes to compile the Handel-C code and VHDL blocks andplace and route the results, he or she may need to:

[0812] Compile the Handel-C code to VHDL.

[0813] Pass the compiled Handel-C and the VHDL model to an RTL synthesistool (such as FPGAExpress).

[0814] Run the place and route.

[0815] Writing Handel-C to Communicate with VHDL

[0816] The code needed in the Handel-C program is in two parts. First,one needs an interface declaration. This prototypes the interface sortand is of the format:

[0817] Interface

[0818] VHDL_entity_sort (VHDL_to_HC_port

[0819] {,VHDL_to_HC_port})

[0820] (VHDL_from_HC_port

[0821] {, VHDL_from_HC_port});

[0822] where:

[0823] VHDL_entity_sort is the name of the VHDL entity. This name may beused as the interface sort.

[0824] VHDL_to_HC_port is the type and name of a port bringing data tothe Handel-C code (output from VHDL) precisely as specified in theunwrapped VHDL entity

[0825] VHDL_from_HC_port is the type and name of a port sending data from the Handel-C code (input to VHDL) precisely as specified in the unwrapped VHDL entity.

[0826] Note that ports are seen from the VHDL side, so port names may beconfusing. In Handel-C, the ports that input data TO the Handel-C may bespecified first.

[0827] One then needs an interface definition. This creates an instanceof that interface sort, and defines the data that may be transmitted.This is of the format:

[0828] Interface

[0829] VHDL_entity_sort (VHDL_to_HC_port [with portSpec]

[0830] {, VHDL_to_HC_port [with portSpec]})

[0831] interface_Name (VHDL_from_HC_data = from_HC_data

[0832] [with portSpec]

[0833] {, VHDL_from_HC_data = from_HC_data

[0834] [with portSpec]})

[0835] with {extlib=“PluginModelSim.dll”,

[0836] extinst=“instanceName; model=entity_wrapper;

[0837] clock=clockName:period; delay=units”};

[0838] where:

[0839] VHDL_entity_sort is the interface sort that one previouslydeclared.

[0840] VHDL_to_HC_port is the type and name of a port bringing data to the

[0841] Handel-C code (output from VHDL). This may have the same type asdefined in the interface declaration

[0842] interface_Name is the name for this instance of the interface.

[0843] VHDL_from_HC_port is the type and name of a port sending datafrom the Handel-C code (input to VHDL). This may have the same type asdefined in the interface declaration

[0844] VHDL_from_HC_data is an expression that is output from the Handel-C to the VHDL.

[0845] with portSpec is an optional port specification. FIGS. 40A and40B illustrate a table of possible specifications 4000, in accordancewith one embodiment of the present invention. The with list after theport listings gives the specifications for all the ports on theinstance. These general specifications may be overruled by anyindividual port specifications.

[0846] extlib=“PluginModelSim.dll” specifies the cosimulator used. Theextinst string gives the parameters to the cosimulator plugin. Theparameters for PluginModelSim.dll are as follows:

[0847] instanceName is a unique name representing that instance of theVHDL entity. It is recommended that this is the same as theinterface_Name.

[0848] entity_wrapper is the name of the VHDL wrapper component.

[0849] clock=clockName:period is only needed in clocked circuits. It defines the port and period of the clock input to the VHDL from Handel-C. clockName is the name of the port that carries the clock signal. period is the number of simulator time units per clock tick. The simulation time in ModelSim is advanced by this time delay every clock cycle.

[0850] delay=units is optional. It gives the combinational delay to beused by the simulator to allow a combinational input to propagate to anoutput. For zero delay models as used in RTL synthesis, a single timeunit is all that is required. The default value is 1.

[0851] ModelSim may be automatically started when the Handel-C model isrun and may be automatically closed when the Handel-C model is closed.Error messages relating to the VHDL model may appear in the ModelSimmessage window, but may also be reflected back to the Handel-C debugwindow.

[0852] Clocked Circuit Simulation

[0853] The simulator time units are determined by ModelSim's preferences, which may be found in a modelsim.ini file in the local directory. (It is created on first use of the simulator in any directory; one can then edit it to modify the settings.) The default time unit is ns. If one has the values clock=ck:25; delay=1, clock rising edges may occur at 25 ns, 50 ns, 75 ns and the outputs may be sampled at 26 ns, 51 ns, 76 ns and so on. Clocks are assumed to have equal mark:space ratios. However, ModelSim can only deal with delays that are integral multiples of the time unit. If the period is odd (as in this case), the high time may be shorter than the low time, so in this case the clock may have a 12:13 ratio.
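The timing figures in the preceding paragraph can be reproduced with integer arithmetic. The sketch below is in conventional C and is not part of the plugin; the type ClockSplit and the functions split_period, rising_edge_time and sample_time are hypothetical names, and the rule that the high time is the period divided by two and rounded down is inferred from the 12:13 example.

```c
/* High and low times of the clock, in simulator time units. */
typedef struct {
    unsigned high;
    unsigned low;
} ClockSplit;

/* Splits a period into high and low times that are whole time units.
 * For an odd period the high time is the shorter of the two. */
ClockSplit split_period(unsigned period)
{
    ClockSplit s;

    s.high = period / 2;          /* integer division rounds down */
    s.low = period - s.high;
    return s;
}

/* Time of the n-th rising edge (n = 1, 2, 3, ...). */
unsigned rising_edge_time(unsigned period, unsigned n)
{
    return period * n;
}

/* Time at which outputs are sampled after the n-th rising edge. */
unsigned sample_time(unsigned period, unsigned delay, unsigned n)
{
    return period * n + delay;
}
```

With period 25 and delay 1, the rising edges fall at 25, 50 and 75 time units, the samples at 26, 51 and 76, and the high and low times are 12 and 13.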

[0854] Interfacing the VHDL with the Handel-C Simulator

[0855]FIG. 41 illustrates the use of various VHDL files 4100, inaccordance with one embodiment of the present invention. One needs toprovide a wrapper file 4102 for VHDL code 4104. The wrapper file wrapsthe VHDL code, connecting the entity ports to dummy signals and providesthe interface to the Handel-C simulator plugin 4106. The wrapper code isonly required in the simulation phase, not in the synthesis phase. Thefollowing information assumes that one has two VHDL files, the objectcode for the architecture file (entity_architecture.vhdl) and the sourcecode for the interface to the behavior file (entity.vhdl).

[0856] One needs to examine the ports defined in the entity file, and ensure that each port is connected to a signal in a wrapper file. A sample wrapper file is provided. It assumes that the plugin, entity and wrapper file have all been compiled to the default work library.

entity name_wrapper is
end;

-- do standard library stuff
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;

architecture top_level of name_wrapper is
    signal name : type; (repeat as necessary)
begin
    pluginName: entity work.plugin; -- connect to Handel-C link
    entityName: entity work.entity port map (signal_names);
end;

[0857] To use the file, replace the sections in italics as follows. name_wrapper: replace with the appropriate wrapper name.

[0858] entity_wrapper is recommended.

[0859] signal name : type; replace with a list of dummy signals that connect to the entity ports for compilation purposes. These signals can have any name, but the format and order of the ports may be exactly as specified in the VHDL.

[0860] pluginName is a user-defined name for that instance of the plugin that connects the signals through the simulators.

[0861] entityName is a user-defined name for that instance of the entity.

[0862] signal_names is a comma-separated list of the dummy signals.

[0863] Note that a limited number of port types are supported:

[0864] 1-bit types in Handel-C may be implemented by std_logic

[0865] n-bit unsigned and signed types in Handel-C may be implemented by std_logic_arith.unsigned

[0866] No other types may be used. If the circuit uses other types, one may need to create another VHDL wrapper containing type conversions to these three types between the plugin wrapper and the circuit to be integrated.

[0867] Example

[0868] The following example shows the code for a trivial VHDL entity file simple.vhdl. This describes the interface for the simple architecture.

library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
entity simple is
    port (input : in unsigned(63 downto 0);
          output : out unsigned(63 downto 0);
          simtime : out unsigned(31 downto 0));
end;
architecture behavior of simple is
begin
    process(input)
    begin
        output <= conv_unsigned(input*conv_unsigned(2,input'length), output'length);
        simtime <= conv_unsigned(input, 32);
    end process;
end;

[0869] This shows the code for the wrapper file for simple.vhdl. This file would be called simple_wrapper.vhdl.

entity simple_wrapper is
end;
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
architecture top_level of simple_wrapper is
    signal x : unsigned (63 downto 0);
    signal z : unsigned (63 downto 0);
    signal t : unsigned (31 downto 0);
begin
    plugin1: entity work.plugin;
    simple1: entity work.simple port map (x, z, t);
end;

[0870] Handel-C Code

[0871] This is the interface to the VHDL. Note that the interface sort is simple and the port names are identical to the input and output entity names in the VHDL.

set clock = external "P1";
unsigned 64 x;
interface simple(int 64 output, int 32 simtime)
    t1(int 64 input = x)
    with {extlib="PluginModelSim.dll", extinst="simple_wrapper"};
void main(void)
{
    unsigned 64 y;
    unsigned 32 now;
    x = 1;
    while(1)
    {
        par
        {
            y = t1.output;    // set y to the vhdl output
            now = t1.simtime; // and now to simtime
        }
        x = y;
    }
}

[0872] Compiling and Simulating the Examples

[0873] These examples are installed in the subdirectory Handel-C\Examples\VHDL. There are two projects. Example1 contains the combinatorial circuit and Example2 contains the registers example. Supplied with each example is a batch file that compiles the VHDL for ModelSim. To run the examples, one may set up ModelSim for interfacing with Handel-C, compile the VHDL and compile the Handel-C.

[0874] Setting up ModelSim

[0875] Go to the project directory and double click on the batch file handelc_vhdl.bat to run it. This only has to be done once.

[0876] Compiling the VHDL

[0877] Double click the appropriate project batch file (ttl7446_vhdl.bat for the combinatorial logic project and reg32x1k_vhdl.bat for the registers project). This compiles the VHDL. If one changes the VHDL code, one may need to recompile it.

[0878] Compiling and Simulating the Handel-C

[0879] Double-click on the workspace file (e.g. Example1.hw) to start Handel-C. Click the Build button or select Build>Build to compile and build the example. Once it has built, select Debug>Go or Debug>Step Into to start the simulation.

[0880] A Simple Combinatorial Circuit Example

[0881] The VHDL Code

[0882] The VHDL code for the combinatorial circuit is in the file ttl7446.vhdl

library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
entity TTL7446 is
    port (ltn : in std_logic;
          rbin : in std_logic;
          digit : in unsigned(3 downto 0);
          bin : in std_logic;
          segments : out unsigned(0 to 6);
          rbon : out std_logic);
end;
architecture behavior of TTL7446 is
begin
    process(ltn, rbin, bin, digit)
    begin
        rbon <= '1';
        if bin = '0' then
            segments <= "1111111";
        elsif ltn = '0' then
            segments <= "1000000";
        else
            case digit is
                when "0000" =>
                    segments <= "0000001";
                    if rbin = '0' then
                        segments <= "1111111";
                        rbon <= '0';
                    end if;
                when "0001" => segments <= "1001111";
                when "0010" => segments <= "0010010";
                when "0011" => segments <= "0000110";
                when "0100" => segments <= "1001100";
                when "0101" => segments <= "0100100";
                when "0110" => segments <= "1100000";
                when "0111" => segments <= "0001111";
                when "1000" => segments <= "0000000";
                when "1001" => segments <= "0001100";
                when "1010" => segments <= "1110010";
                when "1011" => segments <= "1100110";
                when "1100" => segments <= "1011100";
                when "1101" => segments <= "0110100";
                when "1110" => segments <= "1110000";
                when "1111" => segments <= "1111111";
                when others => segments <= "XXXXXXX";
            end case;
        end if;
    end process;
end;

[0883] A Sample Wrapper for the Combinatorial Circuit

[0884] The VHDL wrapper code for the combinatorial circuit is in the file ttl7446_wrapper.vhdl

entity TTL7446_wrapper is
end;
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
architecture HandelC of TTL7446_wrapper is
    signal ltn : std_logic;
    signal rbin : std_logic;
    signal digit : unsigned(3 downto 0);
    signal bin : std_logic;
    signal segments : unsigned(0 to 6);
    signal rbon : std_logic;
begin
    plugin1: entity work.plugin;
    ttl: entity work.TTL7446 port map (ltn, rbin, digit, bin, segments, rbon);
end;

[0885] This shows the two instances. It also shows each port of the circuit to be integrated connected to a signal which is not connected to anything else. This is not a requirement of the plugin, but a requirement of VHDL. Note that VHDL'93 features have been used to create direct instantiations of the components.

[0886] Example Handel-C Using the Combinatorial Circuit

[0887] This is in the file ttl7446_test.c

// Set chip details
set clock = external "D17";
set part = "V1000BG560-4";
// Interface declaration
interface TTL7446(unsigned 7 segments, unsigned 1 rbon)
    (unsigned 1 ltn, unsigned 1 rbin, unsigned 4 digit, unsigned 1 bin);
// Main program
void main(void)
{
    unsigned 1 ltnVal;
    unsigned 1 rbinVal;
    unsigned 1 binVal;
    unsigned 4 digitVal;
    unsigned 1 rbonVal;
    unsigned 20 Delay;
    interface TTL7446(unsigned 7 segments, unsigned 1 rbon)
        decode(unsigned 1 ltn=ltnVal, unsigned 1 rbin=rbinVal,
               unsigned 4 digit=digitVal, unsigned 1 bin=binVal)
        with {extlib="PluginModelSim.dll",
              extinst="decode; model=TTL7446_wrapper; delay=1"};
    interface bus_out( ) display(unsigned display = ~decode.segments)
        with {extlib="7segment.dll", extinst="0",
              data={"AN28", "AK25", "AL26", "AJ24", "AM27", "AM26", "AK24"}};
    par
    {
        ltnVal = 0;
        rbinVal = 0;
        binVal = 0;
        digitVal = 0;
    }
    while(1)
    {
        binVal = 1;
        ltnVal = 1;
        do
        {
            do
            {
                rbonVal = decode.rbon;
                digitVal++;
#ifndef SIMULATE
                do { Delay++; } while(Delay!=0);
#endif
            } while (digitVal != 0);
            rbinVal++;
        } while (rbinVal != 0);
    }
}

[0888] Interface Code

[0889] One may declare an interface sort that has port names of the same name and type as the VHDL signals in the circuit to be integrated. The interface sort may be the same as the VHDL model's name.

[0890] interface TTL7446(unsigned 7 segments, unsigned 1 rbon) (unsigned1 ltn, unsigned 1 rbin, unsigned 4 digit, unsigned 1 bin);

[0891] An instance of this component is then created one or more timesin the Handel-C code. An example of an instantiation is:

[0892] interface TTL7446(unsigned 7 segments, unsigned 1 rbon) decode(unsigned 1 ltn=ltnVal, unsigned 1 rbin=rbinVal, unsigned 4 digit=digitVal, unsigned 1 bin=binVal) with {extlib="PluginModelSim.dll", extinst="decode; model=TTL7446_wrapper; delay=1"};

[0893] Simple Register Bank Circuit Example

[0894] VHDL Code

[0895] The VHDL code for the register bank circuit is in the file reg32x1k.vhdl

library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
entity reg32x1k is
    -- simple synchronous register bank, 32 bits wide and 1k registers deep
    port(address : in unsigned(9 downto 0);
         data_in : in unsigned(31 downto 0);
         ck : in std_logic;
         write : in std_logic;
         data_out : out unsigned(31 downto 0));
end;
architecture behavior of reg32x1k is
    type register_array is array (natural range <>) of unsigned(31 downto 0);
    signal data : register_array(0 to 1023) := (others => (others => '0'));
begin
    process (ck)
    begin
        if ck'event and ck = '1' then
            if write = '1' then
                data(conv_integer(address)) <= data_in;
            end if;
        end if;
    end process;
    data_out <= data(conv_integer(address));
end;

[0896] VHDL Wrapper for Registers Example

[0897] This is the file reg32x1k_wrapper.vhdl.

entity reg32x1k_wrapper is
end;
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
architecture HandelC of reg32x1k_wrapper is
    signal address : unsigned(9 downto 0) := (others => '0');
    signal data_in : unsigned(31 downto 0) := (others => '0');
    signal ck : std_logic := '0';
    signal write : std_logic := '0';
    signal data_out : unsigned(31 downto 0) := (others => '0');
begin
    plugin1: entity work.plugin;
    registers: entity work.reg32x1k port map (address, data_in, ck, write, data_out);
end;

[0898] Handel-C Code to Interface with Registers

[0899] This is the file reg32x1k_test.c.

// Set chip details
set clock = external "D17";
set part = "V1000BG560-4";
// Interface declaration
interface reg32x1k(unsigned 32 data_out)
    (unsigned 10 address, unsigned 32 data_in, unsigned 1 ck, unsigned 1 write);
// Main program
void main(void)
{
    unsigned 32 data_outVal;
    unsigned 10 addressVal;
    unsigned 32 data_inVal;
    unsigned 1 writeVal;
    interface reg32x1k(unsigned 32 data_out)
        registers(unsigned 10 address = addressVal with {extpath = {data_out}},
                  unsigned 32 data_in = data_inVal,
                  unsigned 1 ck = _clock,
                  unsigned 1 write = writeVal)
        with {extlib="PluginModelSim.dll",
              extinst="1; model=reg32x1k_wrapper; clock=ck:25"};
    par
    {
        addressVal = 0;
        data_inVal = 0;
        writeVal = 0;
    }
    while(1)
    {
        par
        {
            writeVal = 1;
            addressVal = 0;
        }
        do
        {
            par
            {
                addressVal++;
                data_inVal += 10;
            }
            data_outVal = registers.data_out;
        } while (addressVal < 10);
        par
        {
            writeVal = 0;
            addressVal = 0;
        }
        do
        {
            addressVal++;
            data_outVal = registers.data_out;
        } while (addressVal < 10);
    }
}

[0900] Application Programmers Interface

[0901] FIG. 41A illustrates a method 4150 for equipping a simulator with plug-ins. In general, in operation 4152, a first simulator written in a first programming language is executed for generating a first model. Further, in operation 4154, a second simulator written in a second programming language is executed to generate a second model. In one aspect, the first simulator may be cycle-based and the second simulator may be event-based. More information on such types of simulators will be set forth hereinafter in greater detail during reference to FIG. 44A.

[0902] By this design, a co-simulation may be performed utilizing the first model and the second model. See operation 4156. In one aspect of the present invention, the accuracy and speed of the co-simulation may be user-specified. In another aspect, the co-simulation may include interleaved scheduling.

[0903] In an additional aspect of the present invention, the co-simulation may include fully propagated scheduling. In a further aspect, the simulations may be executed utilizing a plurality of processors (i.e. a co-processor system). In even another aspect, the first simulator may be executed ahead of or behind the second simulator. In yet an additional aspect, the first simulator may interface with the second simulator via a plug-in. More information regarding such alternate embodiments will be set forth hereinafter in greater detail.

[0904] The Application Programmers Interface (API) thus describes how to write plugins to connect to the Handel-C simulator. Plugins are programs that run on the PC and connect to a Handel-C clock or interface. They can be written in any language.

[0905] Examples of useful plugins are:

[0906] Simulated oscilloscope

[0907] Simulated wave-form generators

[0908] Selected display and storage of variables for debugging

[0909] Co-simulation of other circuits

[0910] Data Widths in the Simulator

[0911] The simulator uses 32-bit, 64-bit or arbitrary width arithmetic as appropriate. The interface to the simulator uses pointers to values of defined widths. Where 32 bit or 64 bit widths are used, data is stored in the most significant bits.

[0912] Simulator Interface

[0913] The plugin is identified to the simulator by:

[0914] the name of the compiled .dll (the compiled plugin)

[0915] the function calls that pass data between the plugin and the Handel-C program

[0916] the instance name

[0917] These are passed to the simulator using the following with specifications:

[0918] extlib Specifies the name of the DLL. No default.

[0919] extinst Specifies an instance string. No default.

[0920] extfunc Specifies the function to call to pass data to the plugin or get data from the plugin. Defaults to PlugInSet( ) for passing data to the plugin and PlugInGet( ) to get data from the plugin.

[0921] The simulator expects the plugin to support various function calls and some data structures. The simulator also has functions that can be called by the plugin (callback functions). These functions give information about the state of variables in the Handel-C program. FIGS. 42A and 42B illustrate various function calls 4200 and the various uses thereof, in accordance with one embodiment of the present invention.

[0922] Function Name Retention in C++

[0923] The simulator requires that the function names within the plugin are retained. Since C++ compilers may change function names, one may ensure that the function names are identified as C types. To do so, one may either compile the plugin as a C file or, if compiling it as C++, use the extern extension to force the compiler to use the C naming convention. To compile the function as C++, place the string extern "C" immediately before the function definition to ensure that the function names are exported as written, e.g.

extern "C" dll void PlugInOpen(HCPLUGIN_INFO *Info, unsigned long NumInst)
{
    // this function intentionally left blank
    // initializing before the first simulation is run
}
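Where the same plugin source may be built either as C or as C++, a common idiom is to guard the declarations with a conditional extern "C" block instead of prefixing each function. The following is a minimal sketch of that idiom, not part of the described API: the PLUGIN_EXPORT macro and the close_calls counter are hypothetical names introduced only for illustration.

```c
/* Export macro: __declspec(dllexport) on Windows builds, empty elsewhere.
   PLUGIN_EXPORT is a hypothetical name chosen for this sketch. */
#ifdef _WIN32
#define PLUGIN_EXPORT __declspec(dllexport)
#else
#define PLUGIN_EXPORT
#endif

/* Under a C++ compiler the declarations get C linkage, so the exported
   names are not changed; under a C compiler the guard expands to nothing. */
#ifdef __cplusplus
extern "C" {
#endif

PLUGIN_EXPORT void PlugInClose(void);

#ifdef __cplusplus
}
#endif

/* Hypothetical counter so that a call can be observed from a test. */
static int close_calls = 0;

PLUGIN_EXPORT void PlugInClose(void)
{
    close_calls++;
}
```

The guarded form keeps one source file usable under both compilers, at the cost of a few extra preprocessor lines.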

[0924] Specifying Plugins in the Handel-C Source Code

[0925] Plugins are specified in the Handel-C source code using the extlib, extinst and extfunc specifications. These specifications may be applied to clocks or interface definitions. For example:

[0926] set clock = external "P1" with {extlib="plugin.dll", extinst="instance( )"};

In the case of interface definitions, the specifications may be specified for individual ports or for the interface as a whole. For example:

interface bus_in(unsigned 4 Input) BusName( )
    with {extlib="plugin.dll", extinst="some instance string",
          extfunc="BusNameGetValue"};

interface bus_ts(unsigned 4 Input
        with {extlib="plugin.dll", extinst="some instance string",
              extfunc="BusNameGetValue"})
    BusName(unsigned 4 Output
        with {extlib="plugin.dll", extinst="some instance string",
              extfunc="BusNameSetValue"},
            unsigned 1 Enable
        with {extlib="plugin.dll", extinst="some instance string",
              extfunc="BusNameEnable"});

[0927] Data Structures

[0928] Structure Passed on Startup

[0929] The following data structure passes essential information from the simulator to the plugin on startup.

[0930] HCPLUGIN_INFO

typedef struct
{
    unsigned long Size;
    void *State;
    HCPLUGIN_CALLBACKS CallBacks;
} HCPLUGIN_INFO;

[0931] Members

[0932] Size Set to sizeof(HCPLUGIN_INFO) as a corruption check.

[0933] State Simulator identifier which may be used in callbacks from the plugin to the simulator. This value should be passed in future calls to any function in the CallBacks structure.

[0934] CallBacks Data structure containing pointers to the callback functions from the plugin to the simulator. See below for details of these functions.

[0935] Callback Data Structure

[0936] HCPLUGIN_CALLBACKS

[0937] The pointers to the callback functions are contained in the following structure, which is a member of the HCPLUGIN_INFO structure passed to the PlugInOpen( ) function. Size should be set to sizeof(HCPLUGIN_CALLBACKS).

typedef struct
{
    unsigned long Size;
    HCPLUGIN_ERROR_FUNC PluginError;
    HCPLUGIN_GET_VALUE_COUNT_FUNC PluginGetValueCount;
    HCPLUGIN_GET_VALUE_FUNC PluginGetValue;
    HCPLUGIN_GET_MEMORY_ENTRY_FUNC PluginGetMemoryEntry;
} HCPLUGIN_CALLBACKS;
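A plugin might use the two Size members as sanity checks before keeping the startup information for later callbacks. The sketch below illustrates that under stated assumptions: the structure layouts and function pointer types follow the definitions in this section, HCPLUGIN_VALUE is left opaque, and the variables sim_state, sim_callbacks and plugin_ready are hypothetical names, not part of the described interface.

```c
#include <stddef.h>

/* Opaque here; only pointers to it are needed in this sketch. */
typedef struct HCPLUGIN_VALUE_tag HCPLUGIN_VALUE;

typedef void (*HCPLUGIN_ERROR_FUNC)(void *State, unsigned long Level,
                                    char *Message);
typedef unsigned long (*HCPLUGIN_GET_VALUE_COUNT_FUNC)(void *State);
typedef void (*HCPLUGIN_GET_VALUE_FUNC)(void *State, unsigned long Index,
                                        HCPLUGIN_VALUE *Value);
typedef void (*HCPLUGIN_GET_MEMORY_ENTRY_FUNC)(void *State,
                                               unsigned long Index,
                                               unsigned long Offset,
                                               HCPLUGIN_VALUE *Value);

typedef struct {
    unsigned long Size;
    HCPLUGIN_ERROR_FUNC PluginError;
    HCPLUGIN_GET_VALUE_COUNT_FUNC PluginGetValueCount;
    HCPLUGIN_GET_VALUE_FUNC PluginGetValue;
    HCPLUGIN_GET_MEMORY_ENTRY_FUNC PluginGetMemoryEntry;
} HCPLUGIN_CALLBACKS;

typedef struct {
    unsigned long Size;
    void *State;
    HCPLUGIN_CALLBACKS CallBacks;
} HCPLUGIN_INFO;

/* Hypothetical plugin-side storage for the startup information. */
static void *sim_state = NULL;
static HCPLUGIN_CALLBACKS sim_callbacks;
static int plugin_ready = 0;

void PlugInOpen(HCPLUGIN_INFO *Info, unsigned long NumInst)
{
    (void)NumInst; /* not needed in this sketch */
    plugin_ready = 0;
    /* Reject a missing or mismatched structure (corruption check). */
    if (Info == NULL || Info->Size != sizeof(HCPLUGIN_INFO))
        return;
    if (Info->CallBacks.Size != sizeof(HCPLUGIN_CALLBACKS))
        return;
    sim_state = Info->State;         /* passed back in every callback */
    sim_callbacks = Info->CallBacks; /* copy of the function pointers */
    plugin_ready = 1;
}
```

Copying the callback structure, rather than keeping the Info pointer, is a design choice of this sketch; the text does not say how long the Info structure remains valid.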

[0938] Source File Position Structures

[0939] A source position consists of a list of individual source code ranges. Each range details the source file and a range of lines and columns. The list of ranges consists of a singly linked list of source code ranges. Lists of positions are generated by some Handel-C source code constructs. For example, a call to a macro proc produces positions for the body elements of the macro proc with two members of the position range list. One points to inside the macro proc body and the other points to the call of the macro proc. Lists of positions are also generated for replicators and arrays of functions. The following data structures are used to represent source positions of objects:

HCPLUGIN_POS_ITEM

typedef struct HCPLUGIN_POS_ITEM_tag
{
    unsigned long Size;
    char *FileName;
    long StartLine;
    long StartColumn;
    long EndLine;
    long EndColumn;
    struct HCPLUGIN_POS_ITEM_tag *Next;
} HCPLUGIN_POS_ITEM;

[0940] Members

[0941] Size Set to sizeof(HCPLUGIN_POS_ITEM) as a corruption check.

[0942] FileName Source file name of position range.

[0943] StartLine First line of range. -1 indicates the filename is an object file with no debug information. Line counts start from zero.

[0944] StartColumn First column of range. -1 indicates the filename is an object file with no debug information. Column counts start from zero.

[0945] EndLine Last line of range. -1 indicates the filename is an object file with no debug information. Line counts start from zero.

[0946] EndColumn Last column of range. -1 indicates the filename is an object file with no debug information. Column counts start from zero.

[0947] Next Pointer to next position range in list. NULL indicates this is the last position range in the list.

HCPLUGIN_POSITION

typedef struct
{
    unsigned long Size;
    HCPLUGIN_POS_ITEM *SourcePos;
} HCPLUGIN_POSITION;

[0948] Members

[0949] Size Set to sizeof(HCPLUGIN_POSITION) as a corruption check.

[0950] SourcePos Pointer to first position range in the linked list.
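Walking the position list can be sketched as follows. The two structures repeat the definitions above; count_ranges and has_debug_info are hypothetical helper names, and stopping at an item whose Size check fails is a defensive choice of this sketch rather than documented behavior.

```c
#include <stddef.h>

typedef struct HCPLUGIN_POS_ITEM_tag {
    unsigned long Size;
    char *FileName;
    long StartLine;
    long StartColumn;
    long EndLine;
    long EndColumn;
    struct HCPLUGIN_POS_ITEM_tag *Next;
} HCPLUGIN_POS_ITEM;

typedef struct {
    unsigned long Size;
    HCPLUGIN_POS_ITEM *SourcePos;
} HCPLUGIN_POSITION;

/* Count the ranges in a position by following Next until NULL.
   Returns 0 for a missing position or one that fails the Size check. */
static unsigned long count_ranges(const HCPLUGIN_POSITION *pos)
{
    unsigned long n = 0;
    const HCPLUGIN_POS_ITEM *item;
    if (pos == NULL || pos->Size != sizeof(HCPLUGIN_POSITION))
        return 0;
    for (item = pos->SourcePos; item != NULL; item = item->Next) {
        if (item->Size != sizeof(HCPLUGIN_POS_ITEM))
            break; /* treat a bad item as the end of usable data */
        n++;
    }
    return n;
}

/* A StartLine of -1 marks an object file with no debug information. */
static int has_debug_info(const HCPLUGIN_POS_ITEM *item)
{
    return item->StartLine != -1;
}
```

For a macro proc call, such a walk would be expected to visit two ranges: one inside the macro proc body and one at the call site.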

[0951] Variable Value Structures

[0952] The following data structure is used to pass information on variable values from the simulator to the plugin. The plugin can query and set the values of variables in the simulator using these data structures and the associated callback functions of types HCPLUGIN_GET_VALUE_FUNC and HCPLUGIN_GET_MEMORY_ENTRY_FUNC. Values are accessed via an index using these functions. See below for further details of these functions.

[0953] HCPLUGIN_VALUE

[0954] typedef enum

[0955] {

[0956] HCPluginValue,

[0957] HCPluginArray,

[0958] HCPluginStruct,

[0959] HCPluginRAM,

[0960] HCPluginROM,

[0961] HCPluginWOM,

[0962] } HCPLUGIN_VALUE_TYPE;

[0963] The HCPLUGIN_VALUE_TYPE enumerated type is used to define the type of object value contained in the HCPLUGIN_VALUE data structure. The values have the following meanings:

[0964] HCPluginValue General value used for registers and signals.

[0965] Data.ValueData member of the HCPLUGIN_VALUE structure should be used.

[0966] HCPluginArray Array value. Data structure contains a list of value indices in the Data.ArrayData member of the HCPLUGIN_VALUE structure.

[0967] HCPluginStruct Structure value. Data structure contains a linked list of values in the Data.StructData member of the HCPLUGIN_VALUE structure.

[0968] HCPluginRAM RAM memory value. Data structure contains the number of entries in the memory in the Data.MemoryData member of HCPLUGIN_VALUE.

[0969] HCPluginROM ROM memory value. Data structure contains the number of entries in the memory in the Data.MemoryData member of HCPLUGIN_VALUE.

[0970] HCPluginWOM WOM memory value. Data structure contains the number of entries in the memory in the Data.MemoryData member of HCPLUGIN_VALUE.

typedef struct HCPLUGIN_STRUCT_ENTRY_tag
{
    unsigned long Size;
    HCPLUGIN_POSITION *Position;
    char *Name;
    unsigned long ValueIndex;
    struct HCPLUGIN_STRUCT_ENTRY_tag *Next;
} HCPLUGIN_STRUCT_ENTRY;

typedef struct HCPLUGIN_VALUE_tag
{
    unsigned long Size;
    HCPLUGIN_POSITION *Position;
    unsigned long Internal[5];
    int TopLevel;
    char *Name;
    HCPLUGIN_VALUE_TYPE Type;
    union
    {
        struct
        {
            int Signed;
            unsigned long Base;
            unsigned long Width;
            void *Value;
        } ValueData;
        struct
        {
            unsigned long *Elements;
            unsigned long Length;
        } ArrayData;
        HCPLUGIN_STRUCT_ENTRY *StructData;
        struct
        {
            unsigned long Length;
        } MemoryData;
    } Data;
} HCPLUGIN_VALUE;

[0971] Members of HCPLUGIN_VALUE structure:

[0972] Size Set to sizeof(HCPLUGIN_VALUE) as a corruption check.

[0973] Position Source position of declaration of object.

[0974] Internal Internal data used by the debugger. Do not modify.

[0975] TopLevel Set to 1 if it's a top-level object or 0 otherwise. Examples of objects that are not top level are elements of arrays or members of structures. Used by the debugger.

[0976] Name Identifier of the object.

[0977] Type Type of object that this value represents. See above for details of the HCPLUGIN_VALUE_TYPE enumerated type.

[0978] Data Union containing the value data consisting of Data.ValueData, Data.ArrayData, Data.StructData and Data.MemoryData.

[0979] Elements of HCPLUGIN_VALUE.Data

[0980] Data.ValueData is used to represent basic values (e.g. registers and signals) and contains the following members:

[0981] Signed Zero for an unsigned value, non-zero for a signed value.

[0982] Base Default base used to represent this value (specified using the base spec in the source code). Can be 2, 8, 10 or 16 or 0 for none.

[0983] Width Width of value in bits.

[0984] Value Pointer to value. If Width is less than or equal to 32 bits then this is a long * or unsigned long *. If Width is greater than 32 bits and less than or equal to 64 bits then this is an __int64 * or unsigned __int64 *. If Width is greater than 64 bits then this is a NUMLIB_NUMBER **. Data stored in long, unsigned long, __int64 and unsigned __int64 types is left aligned. This means it occupies the most significant bits in the word and not the least significant bits. For example, 3 stored in a 3 bit wide number in a 32-bit word is represented as 0x60000000. Functions using NUMLIB_NUMBER structures are described hereinafter.
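The left alignment described above amounts to a shift by the unused bit count. The following sketch shows the 32-bit case only; uint32_t is used here as a stand-in for the 32-bit long of the original platform, and left_align32 and right_align32 are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Move a width-bit value into the most significant bits of a 32-bit word.
   Bits of the value above the given width are shifted out and lost. */
static uint32_t left_align32(uint32_t value, unsigned width)
{
    assert(width >= 1 && width <= 32);
    return value << (32u - width);
}

/* Recover an unsigned width-bit value from a left-aligned 32-bit word. */
static uint32_t right_align32(uint32_t word, unsigned width)
{
    assert(width >= 1 && width <= 32);
    return word >> (32u - width);
}
```

For the example in the text, left_align32(3, 3) gives 0x60000000 and right_align32(0x60000000, 3) gives 3 again. Signed values would additionally need sign extension when right-aligned, which this sketch does not attempt.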

[0985] Data.ArrayData is used to represent array values and contains the following members:

[0986] Elements Array of value indices of members of array. These indices can be passed to further calls to the get value function.

[0987] Length Number of elements in the array.

[0988] Data.StructData is used to represent structure values and points to the head of a NULL terminated linked list of structure member objects. See below for details of the HCPLUGIN_STRUCT_ENTRY structure.

[0989] Data.MemoryData is used to represent memory (RAM, ROM and WOM) values and contains the following members:

[0990] Length Number of elements in the memory.

[0991] Associated Functions

[0992] Use the callback function HCPLUGIN_GET_MEMORY_ENTRY_FUNC to access memory elements.
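How the Type member selects the active member of the Data union can be sketched with a small helper that reports how many child objects a value has. The structures repeat the definitions above, except that HCPLUGIN_POSITION is left opaque under an invented tag; child_count is a hypothetical name and not part of the described interface.

```c
#include <stddef.h>

/* Opaque stand-in: only pointers are needed here (tag name is invented). */
typedef struct HCPLUGIN_POSITION_tag HCPLUGIN_POSITION;

typedef enum {
    HCPluginValue, HCPluginArray, HCPluginStruct,
    HCPluginRAM, HCPluginROM, HCPluginWOM
} HCPLUGIN_VALUE_TYPE;

typedef struct HCPLUGIN_STRUCT_ENTRY_tag {
    unsigned long Size;
    HCPLUGIN_POSITION *Position;
    char *Name;
    unsigned long ValueIndex;
    struct HCPLUGIN_STRUCT_ENTRY_tag *Next;
} HCPLUGIN_STRUCT_ENTRY;

typedef struct HCPLUGIN_VALUE_tag {
    unsigned long Size;
    HCPLUGIN_POSITION *Position;
    unsigned long Internal[5];
    int TopLevel;
    char *Name;
    HCPLUGIN_VALUE_TYPE Type;
    union {
        struct { int Signed; unsigned long Base;
                 unsigned long Width; void *Value; } ValueData;
        struct { unsigned long *Elements; unsigned long Length; } ArrayData;
        HCPLUGIN_STRUCT_ENTRY *StructData;
        struct { unsigned long Length; } MemoryData;
    } Data;
} HCPLUGIN_VALUE;

/* Number of child objects: array elements, structure members or memory
   entries. A basic value (register or signal) has none. */
static unsigned long child_count(const HCPLUGIN_VALUE *v)
{
    unsigned long n = 0;
    const HCPLUGIN_STRUCT_ENTRY *e;
    switch (v->Type) {
    case HCPluginValue:
        return 0;
    case HCPluginArray:
        return v->Data.ArrayData.Length;
    case HCPluginStruct:
        for (e = v->Data.StructData; e != NULL; e = e->Next)
            n++;
        return n;
    case HCPluginRAM:
    case HCPluginROM:
    case HCPluginWOM:
        return v->Data.MemoryData.Length;
    }
    return 0;
}
```

Only the union member that matches Type is read in each case, which is the point of the enumerated type.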

[0993] Simulator to Plugin Functions

[0994] These functions are called by the simulator to send information to the plugin. They are called when simulation begins and ends, and at points in the simulator clock cycle. The plugin may act upon the call or do nothing. The plugin may implement the function with identical name and parameters.

[0995] PlugInOpen

[0996] void PlugInOpen(HCPLUGIN_INFO *Info, unsigned long NumInst)

[0997] The simulator calls this function the first time that the plugin .dll is used in a Handel-C session. Each simulator used may make one call to this function for each plugin specified in the source code.

[0998] Info Pointer to structure containing simulator callback information.

[0999] NumInst Number of instances of the plugin specified in the source code. One call to PlugInOpenInstance( ) may be made for each of these instances.

[1000] PlugInOpenInstance

[1001] void *PlugInOpenInstance(char *Name, unsigned long NumPorts)

[1002] This function is called each time one starts a simulation. It is called once for each instance of the plugin in the Handel-C source code. An instance is considered unique if a unique string is used in the extinst specification. The plugin should return a value used to identify the instance in future calls from the simulator. This value may be passed to future calls to

[1003] PlugInOpenPort( ), PlugInSet( ), PlugInGet( ), PlugInStartCycle( ),

[1004] PlugInMiddleCycle( ), PlugInEndCycle( ) and PlugInCloseInstance( ).

[1005] Name String specified in the extinst specification in the source code.

[1006] NumPorts Number of ports associated with this instance. One call to PlugInOpenPort( ) may be made for each of these ports.

[1007] PlugInOpenPort

[1008] void *PlugInOpenPort(void *Instance, char *Name, int Direction, unsigned long Bits)

[1009] This function is called each time one starts a simulation. It is called once for each interface port associated with this plugin in the source code. The plugin should return a value used to identify the port in future calls from the simulator. This value may be passed to future calls to PlugInGet( ),

[1010] PlugInSet( ), and PlugInClosePort( ).

[1011] Instance Value returned by the PlugInOpenInstance( ) function.

[1012] Name Name of the port from the interface definition in the source code.

[1013] Direction Zero for a port transferring data from plugin to simulator, non-zero for a port transferring data from simulator to plugin.

[1014] Bits Width of port.
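One way a plugin might use the value returned from PlugInOpenPort( ) is as a pointer to its own per-port record, released again in PlugInClosePort( ). The sketch below assumes exactly that; the port_state structure and its field names are hypothetical, and copying the port name is a precaution of this sketch because the text does not say how long the Name string stays valid.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-port record kept by the plugin. */
typedef struct {
    void *instance;       /* value returned by PlugInOpenInstance( ) */
    char *name;           /* private copy of the port name */
    int sim_to_plugin;    /* non-zero: data flows from simulator to plugin */
    unsigned long bits;   /* width of the port */
} port_state;

void *PlugInOpenPort(void *Instance, char *Name, int Direction,
                     unsigned long Bits)
{
    port_state *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->name = malloc(strlen(Name) + 1);
    if (p->name == NULL) {
        free(p);
        return NULL;
    }
    strcpy(p->name, Name);
    p->instance = Instance;
    p->sim_to_plugin = (Direction != 0);
    p->bits = Bits;
    return p; /* handed back by the simulator in later calls */
}

void PlugInClosePort(void *Port)
{
    port_state *p = Port;
    if (p == NULL)
        return;
    free(p->name);
    free(p);
}
```

Returning NULL remains valid for plugins that need no per-port state.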

[1015] PlugInSet

[1016] void PlugInSet(void *Instance, void *Port, unsigned long Bits, void *Value)

[1017] This function is called by the simulator to pass data from simulator to plugin. It is guaranteed to be called every time the value on the port changes but may be called more often than that.

[1018] Instance Value returned by the PlugInOpenInstance( ) function.

[1019] Port Value returned by the PlugInOpenPort( ) function.

[1020] Bits Width of port.

[1021] Value Pointer to value. If Bits is less than or equal to 32 bits then this is a long * or unsigned long *. If Bits is greater than 32 bits and less than or equal to 64 bits then this is an __int64 * or unsigned __int64 *. If Bits is greater than 64 bits then this is a NUMLIB_NUMBER **. Data stored in long, unsigned long, __int64 and unsigned __int64 types is left aligned. This means it occupies the most significant bits in the word and not the least significant bits. For example, 3 stored as a 3 bit wide number in a 32-bit word is represented as 0x60000000. Functions using NUMLIB_NUMBER structures are described hereinafter.

[1022] Where 32 bit or 64 bit widths are used, data is stored in the most significant bits.

[1023] PlugInGet

[1024] void PlugInGet(void *Instance, void *Port, unsigned long Bits, void *Value)

[1025] This function is called by the simulator to get data from the plugin. One may use any name one wishes for this function (specified by extfunc) but the parameters may remain the same.

[1026] Instance Value returned by the PlugInOpenInstance( ) function.

[1027] Port Value returned by the PlugInOpenPort( ) function.

[1028] Bits Width of port.

[1029] Value Pointer to value. If Bits is less than or equal to 32 bits then this is a long * or unsigned long *. If Bits is greater than 32 bits and less than or equal to 64 bits then this is an __int64 (Microsoft specific type) * or unsigned __int64 *. If Bits is greater than 64 bits then this is a NUMLIB_NUMBER **. Data stored in long, unsigned long, __int64 and unsigned __int64 types is left aligned. This means it occupies the most significant bits in the word and not the least significant bits. For example, 3 stored in a 3 bit wide number in a 32-bit word is represented as 0x60000000. Functions using NUMLIB_NUMBER structures are described hereinafter.

[1030] Where 32 bit or 64 bit widths are used, data may be stored in the most significant bits. One may left-shift the number into the MSBs so it may be read correctly by the Handel-C code.

[1031] PlugInStartCycle

[1032] void PlugInStartCycle(void *Instance)

[1033] This function is called by the simulator at the start of every simulation cycle.

[1034] Instance Value returned by the PlugInOpenInstance( ) function.

[1035] PlugInMiddleCycle

[1036] void PlugInMiddleCycle(void *Instance)

[1037] This function is called by the simulator immediately before any variables within the simulator are updated.

[1038] Instance Value returned by the PlugInOpenInstance( ) function.

[1039] PlugInEndCycle

[1040] void PlugInEndCycle(void *Instance)

[1041] This function is called by the simulator at the end of every simulation cycle.

[1042] Instance Value returned by the PlugInOpenInstance( ) function.
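The three per-cycle calls can be used to track where the simulator is within a clock cycle. The following sketch assumes the Instance value is a pointer to a record that the plugin itself returned from PlugInOpenInstance( ); the instance_state structure and the cycle_phase names are hypothetical.

```c
/* Hypothetical per-instance record: where we are in the current cycle
   and how many complete cycles have been seen. */
typedef enum { PHASE_IDLE, PHASE_STARTED, PHASE_MIDDLE } cycle_phase;

typedef struct {
    unsigned long cycles;
    cycle_phase phase;
} instance_state;

/* Called at the start of every simulation cycle. */
void PlugInStartCycle(void *Instance)
{
    instance_state *s = Instance;
    s->phase = PHASE_STARTED;
}

/* Called immediately before variables in the simulator are updated. */
void PlugInMiddleCycle(void *Instance)
{
    instance_state *s = Instance;
    s->phase = PHASE_MIDDLE;
}

/* Called at the end of every simulation cycle. */
void PlugInEndCycle(void *Instance)
{
    instance_state *s = Instance;
    s->cycles++;
    s->phase = PHASE_IDLE;
}
```

A plugin that returned NULL from PlugInOpenInstance( ), as the demonstration plugin later in this section does, would have to keep such state in static variables instead.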

[1043] PlugInClosePort

[1044] void PlugInClosePort(void *Port)

[1045] The simulator calls this function when the simulator is shut down. It is called once for every call made to PlugInOpenPort( ).

[1046] Port Value returned by the PlugInOpenPort( ) function.

[1047] PlugInCloseInstance

[1048] void PlugInCloseInstance(void *Instance)

[1049] The simulator calls this function when the simulator is shut down. It is called once for every call made to PlugInOpenInstance( ).

[1050] Instance Value returned by the PlugInOpenInstance( ) function.

[1051] PlugInClose

[1052] void PlugInClose(void)

[1053] The simulator calls this function when the simulator is shut down. It is called once for every call made to PlugInOpen( ).

[1054] Simulator Callback Functions

[1055] The simulator callback functions are used by plugins to query the state of variables within the Handel-C program. This can be used to model memory mapped registers or shared memory resources or to display debug values in non-standard representations (e.g. oscilloscope and logic analyzer displays). The plugin receives pointers to these functions in the Info parameter of the PlugInOpen( ) function call made by the simulator at startup.

[1056] HCPLUGIN_ERROR_FUNC

[1057] typedef void (*HCPLUGIN_ERROR_FUNC)(void *State, unsigned long Level, char *Message);

[1058] The plugin should call this function to report information, warnings or errors. These messages may be displayed in the GUI debug window. In addition, an error may stop the simulation. State State member from the HCPLUGIN_INFO structure passed to the PlugInOpen( ) function.

[1059] Level 0 Information

[1060] 1 Warning

[1061] 2 Error.

[1062] Message Error message string.
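Reporting through the error callback can be sketched as below. The function pointer type and the level values 0, 1 and 2 come from the text; the enumerator names, the check_port_width helper and the record_message stand-in for the simulator side are hypothetical and exist only so the call can be exercised.

```c
#include <stddef.h>

typedef void (*HCPLUGIN_ERROR_FUNC)(void *State, unsigned long Level,
                                    char *Message);

/* Level values as listed in the text; the names are invented here. */
enum {
    PLUGIN_LEVEL_INFO = 0,
    PLUGIN_LEVEL_WARNING = 1,
    PLUGIN_LEVEL_ERROR = 2
};

/* Warn through the callback if a port is wider than this sketch handles. */
static void check_port_width(HCPLUGIN_ERROR_FUNC report, void *state,
                             unsigned long bits)
{
    static char msg[] = "ports wider than 64 bits are not handled";
    if (bits > 64)
        report(state, PLUGIN_LEVEL_WARNING, msg);
}

/* Stand-in for the simulator's callback: remembers the last report. */
static unsigned long last_level = 99;
static char *last_message = NULL;

static void record_message(void *State, unsigned long Level, char *Message)
{
    (void)State;
    last_level = Level;
    last_message = Message;
}
```

In a real plugin the report pointer and the state argument would be the PluginError member and State value received in PlugInOpen( ).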

[1063] HCPLUGIN_GET_VALUE_COUNT_FUNC

[1064] typedef unsigned long (*HCPLUGIN_GET_VALUE_COUNT_FUNC)(void *State);

[1065] The plugin should call this function to query the number of values in the simulator. This number provides the maximum index for the HCPLUGIN_GET_VALUE_FUNC function.

[1066] State State member from the HCPLUGIN_INFO structure passed to the PlugInOpen( ) function.

[1067] HCPLUGIN_GET_VALUE_FUNC

[1068] typedef void (*HCPLUGIN_GET_VALUE_FUNC)(void *State, unsigned long Index, HCPLUGIN_VALUE *Value);

[1069] The plugin should call this function to get a variable value from the simulator. State State member from the HCPLUGIN_INFO structure passed to the PlugInOpen( ) function. Index Index of the variable. Should be between 0 and one less than the return value of the HCPLUGIN_GET_VALUE_COUNT_FUNC function inclusive.

[1070] A map of index to variable name can be built up at startup by repeatedly calling this function and examining the Value structure returned.

[1071] Value Structure containing information about the value.
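Building the index-to-name map mentioned above can be sketched as a linear search over all value indices. To keep the example short, HCPLUGIN_VALUE is reduced here to a stand-in with only the Size and Name members (the real structure is larger); find_value_index, mock_count and mock_get are hypothetical names, and the mock functions merely play the simulator's role. Whether the caller or the simulator fills in Size is not stated in the text, so setting it before the call is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Reduced stand-in for HCPLUGIN_VALUE: only the members used below. */
typedef struct {
    unsigned long Size;
    char *Name;
} HCPLUGIN_VALUE;

typedef unsigned long (*HCPLUGIN_GET_VALUE_COUNT_FUNC)(void *State);
typedef void (*HCPLUGIN_GET_VALUE_FUNC)(void *State, unsigned long Index,
                                        HCPLUGIN_VALUE *Value);

/* Search indices 0 .. count-1 for a value with the given name.
   Returns 1 and stores the index on success, 0 if no value matches. */
static int find_value_index(void *state,
                            HCPLUGIN_GET_VALUE_COUNT_FUNC count,
                            HCPLUGIN_GET_VALUE_FUNC get,
                            const char *name, unsigned long *index_out)
{
    unsigned long i, n = count(state);
    for (i = 0; i < n; i++) {
        HCPLUGIN_VALUE v;
        v.Size = sizeof v; /* assumption: caller sets the Size check */
        v.Name = NULL;
        get(state, i, &v);
        if (v.Name != NULL && strcmp(v.Name, name) == 0) {
            *index_out = i;
            return 1;
        }
    }
    return 0;
}

/* Mock simulator side with three named values. */
static char *mock_names[] = { "x", "y", "data" };

static unsigned long mock_count(void *State)
{
    (void)State;
    return 3;
}

static void mock_get(void *State, unsigned long Index, HCPLUGIN_VALUE *Value)
{
    (void)State;
    Value->Name = mock_names[Index];
}
```

A plugin would typically run such a search once at startup and cache the indices it needs, since the lookup is linear in the number of values.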

[1072] HCPLUGIN_GET_MEMORY_ENTRY_FUNC

[1073] typedef void (*HCPLUGIN_GET_MEMORY_ENTRY_FUNC)(void *State, unsigned long Index, unsigned long Offset, HCPLUGIN_VALUE *Value);

[1074] The plugin should call this function to get a memory entry from the simulator.

[1075] State State member from the HCPLUGIN_INFO structure passed to the PlugInOpen( ) function.

[1076] Index Index of the variable. Should be between 0 and one less than the return value of the HCPLUGIN_GET_VALUE_COUNT_FUNC function inclusive. A map of index to variable name can be built up at startup by repeatedly calling this function and examining the Value structure returned.

[1077] Offset Offset into the RAM. For example, to obtain the value of x[43], Index should refer to x and this value should be 43.

[1078] Value Structure containing information about the value.

[1079] Example

[1080] This example consists of three files:

[1081] A Handel-C file which invokes the plugin through interfaces

[1082] An ANSI-C file containing the plugin functions

[1083] An ANSI-C header file defining the plugin structures

[1084] Plugin file: plugin-Demo.c

[1085] This simple example has one function (MyBusOut) that reads a value from a simulator interface and one function (MyBusIn) that doubles a value and writes it to a simulator interface.

[1086] It responds to the calls to PlugInOpenInstance( ) and PlugInOpenPort( ) by returning NULL. All the other required plugin functions have been defined but do nothing.

    #include "plugin.h"

    #define dll __declspec(dllexport)

    dll void PlugInOpen(HCPLUGIN_INFO *Info, unsigned long NumInst)
    {
        // this function intentionally left blank
        // initialisation before the first simulation is run
    }

    dll void PlugInClose(void)
    {
        // tidy-up after final simulation
    }

    dll void *PlugInOpenInstance(char *Name, unsigned long NumPorts)
    {
        // invoked when one starts a simulation
        // initialize anything required for this simulation
        return NULL;
    }

    dll void PlugInCloseInstance(void *Instance)
    {
    }

    dll void *PlugInOpenPort(void *Instance, char *Name, int Direction, unsigned long Bits)
    {
        // an opportunity to initialize any data structures associated with
        // this port and return the pointer associated with it (which could
        // then be passed to PlugInSet, etc.)
        return NULL;
    }

    dll void PlugInClosePort(void *Port)
    {
    }

    // last value written by the Handel-C program through MyBusOut
    static long DataIn;

    dll void MyBusOut(void *Instance, void *Port, unsigned long Bits, void *Value)
    {
        DataIn = *(long *)Value;
    }

    dll void MyBusIn(void *Instance, void *Port, unsigned long Bits, void *Value)
    {
        *(long *)Value = DataIn * 2;
    }

    dll void PlugInStartCycle(void *Instance)
    {
        // called after start of clock cycle
        // possibly useful with non-standard clocks
    }

    dll void PlugInMiddleCycle(void *Instance)
    {
    }

    dll void PlugInEndCycle(void *Instance)
    {
    }

[1087] C Header File: plugin.h

[1088] This is provided on the installation CD. It contains declarations of the required structures.

[1089] Handel-C file: plugin-demo.c

    set clock = internal "1";

    int 8 a, b;

    macro expr MyOutExpr = a;

    interface bus_out () MyBusOut (MyOutExpr)
        with {extlib="pluginDemo.dll", extinst="0", extfunc="MyBusOut"};

    interface bus_in (int 8) MyBusIn ()
        with {extlib="pluginDemo.dll", extinst="0", extfunc="MyBusIn"};

    void main (void)
    {
        for (a=1; a<10; a++)
        {
            b = MyBusIn.in;
        }
    }

[1090] Numlib Library

[1091] The numlib.dll library is provided. This contains a series of routines to deal with values that are greater than 64 bits wide. These numbers are stored in a NUMLIB_NUMBER structure, which the routines operate on. There are routines to convert NUMLIB_NUMBER structures to 32 and 64-bit values.

[1092] These routines can be accessed by including the header file numlib.h. Their functions are:

Number allocation and de-allocation

EXPORT void NumLibNew(NUMLIB_NUMBER **Num, unsigned long Width) Grab space for a value of Width bits, indirectly pointed to by Num. A pointer to the space acquired is provided in Num.

[1093] For example:

[1094] NUMLIB_NUMBER *Fred;

[1095] NumLibNew(&Fred, 453);

[1096] EXPORT void NumLibFree(NUMLIB_NUMBER *Num) Free grabbed space for value pointed to by Num.

[1097] For example: NumLibFree(Fred);

[1098] General number handling routines

[1099] EXPORT void NumLibSet(char *a, NUMLIB_NUMBER *Result) Set value pointed to by Result to the value of string a.

[1100] For example:

    NUMLIB_NUMBER *Fred;
    NumLibNew(&Fred, 453);
    NumLibSet("1245216474847832194873205083294", Fred);

[1101] EXPORT void NumLibCopy(NUMLIB_NUMBER *Source, NUMLIB_NUMBER *Result) Copy value pointed to by Source to value pointed to by Result.

[1102] EXPORT void NumLibPrint(unsigned long Base, int Signed, NUMLIB_NUMBER *Source) Print value pointed to by Source to screen in Base (display as signed or unsigned according to Signed). If Signed is non-zero, the number is treated as signed (e.g. “-1”). If Signed is zero, the number is treated as unsigned (e.g. “255”).

[1103] EXPORT void NumLibPrintFile(FILE *FilePtr, unsigned long Base, int Signed,

[1104] NUMLIB_NUMBER *Source) Write value pointed to by Source to file pointed to by FilePtr as above.

[1105] EXPORT unsigned long NumLibPrintString(char *Buffer, unsigned long BufferLength, unsigned long Base, int Signed, NUMLIB_NUMBER *SourceIn) Write value pointed to by SourceIn as string to Buffer in given Base (length of Buffer given in BufferLength).

[1106] BufferLength is the maximum length that may be written.

[1107] EXPORT uint32 NumLibBits(NUMLIB_NUMBER *a) Return the width in bits of the value pointed to by a (i.e. the width of a specified in NumLibNew).

[1108] EXPORT void NumLibSetBit(NUMLIB_NUMBER *a, uint32 Bit, int Value) Set bit Bit of value pointed to by a to Value (0 or 1).

[1109] EXPORT int NumLibGetBit(NUMLIB_NUMBER *a, uint32 Bit) Get value of bit Bit of value pointed to by a.

[1110] EXPORT int32 NumLibGetLong(NUMLIB_NUMBER *a) Convert value pointed to by a to 32 bits and return it. The least significant bits are used and the result is right aligned (i.e. normal numbers, not plugin style numbers).

[1111] EXPORT int64 NumLibGetLongLong(NUMLIB_NUMBER *a) Convert value pointed to by a to 64 bits and return it. The least significant bits are used and the result is right aligned (i.e. normal numbers, not plugin style numbers).

[1112] EXPORT void NumLibWriteFile(NUMLIB_NUMBER *a, FILE *FilePtr) Write value pointed to by a in binary format to file pointed to by FilePtr.

[1113] EXPORT void NumLibReadFile(NUMLIB_NUMBER *a, FILE *FilePtr) Read binary format number from a file pointed to by FilePtr and put the result in a. This is the reverse of NumLibWriteFile. The width of a should already be correct. E.g.

    NUMLIB_NUMBER *Fred;
    FILE *FilePointer = fopen("file.dat", "rb");
    NumLibNew(&Fred, 453);
    NumLibReadFile(Fred, FilePointer);

Arithmetic operations

[1114] Note that in Handel-C, one can only do signed by signed or unsigned by unsigned division and cannot mix types. All operations are Handel-C like, and require some widths and/or type information.

[1115] EXPORT void NumLibUMinus(NUMLIB_NUMBER *a, NUMLIB_NUMBER *b); b = -a

[1116] EXPORT void NumLibAdd(NUMLIB_NUMBER *a, NUMLIB_NUMBER *b,

[1117] NUMLIB_NUMBER *Result) Result = a + b

[1118] EXPORT void NumLibSubtract(NUMLIB_NUMBER *a, NUMLIB_NUMBER *b,

[1119] NUMLIB_NUMBER *Result) Result = a − b

[1120] EXPORT void NumLibMultiply(NUMLIB_NUMBER *a, NUMLIB_NUMBER *b,

[1121] NUMLIB_NUMBER *Result) Result = a * b

[1122] EXPORT void NumLibDivide(NUMLIB_NUMBER *a, NUMLIB_NUMBER *b, int Signed,

[1123] NUMLIB_NUMBER *Result) Result = a / b. All numbers treated as signed or unsigned, depending on the value of Signed.

[1124] EXPORT void NumLibMod(NUMLIB_NUMBER *a, NUMLIB_NUMBER *b, int Signed,

[1125] NUMLIB_NUMBER *Result) Result = a % b. All numbers treated as signed or unsigned, depending on the value of Signed.

[1126] EXPORT void NumLibDivMod(NUMLIB_NUMBER *a, NUMLIB_NUMBER *b, int Signed,

[1127] NUMLIB_NUMBER *DivResult, NUMLIB_NUMBER *ModResult) DivResult = a / b,

[1128] ModResult = a % b. All numbers treated as signed or unsigned, depending on the value of Signed.

[1129] Comparisons

[1130] EXPORT unsigned long NumLibCompareEq(NUMLIB_NUMBER *a, char *b) Return result of comparison of number a to string b. Equivalent to:

    NUMLIB_NUMBER *Temp;
    unsigned long Res;
    NumLibNew(&Temp, a->Width);
    NumLibSet(b, Temp);
    NumLibEquals(a, Temp, &Res);
    NumLibFree(Temp);
    return Res;

[1131] EXPORT void NumLibEquals(NUMLIB_NUMBER *a, NUMLIB_NUMBER *b, unsigned long *Result) Return result of (a == b)

[1132] EXPORT void NumLibNotEquals(NUMLIB_NUMBER *a, NUMLIB_NUMBER *b, unsigned long *Result) Return result of (a != b)

[1133] EXPORT void NumLibSGT(NUMLIB_NUMBER *a, NUMLIB_NUMBER *b, unsigned long *Result) Return result of (a > b) (a and b signed)

[1134] EXPORT void NumLibSGTE(NUMLIB_NUMBER *a, NUMLIB_NUMBER *b, unsigned long *Result) Return result of (a >= b) (a and b signed)

[1135] EXPORT void NumLibSLT(NUMLIB_NUMBER *a, NUMLIB_NUMBER *b, unsigned long *Result) Return result of (a < b) (a and b signed)

[1136] EXPORT void NumLibSLTE(NUMLIB_NUMBER *a, NUMLIB_NUMBER *b, unsigned long *Result) Return result of (a <= b) (a and b signed)

[1137] EXPORT void NumLibUGT(NUMLIB_NUMBER *a, NUMLIB_NUMBER *b, unsigned long *Result) Return result of (a > b) (a and b unsigned)

[1138] EXPORT void NumLibUGTE(NUMLIB_NUMBER *a, NUMLIB_NUMBER *b, unsigned long *Result) Return result of (a >= b) (a and b unsigned)

[1139] EXPORT void NumLibULT(NUMLIB_NUMBER *a, NUMLIB_NUMBER *b, unsigned long *Result) Return result of (a < b) (a and b unsigned)

[1140] EXPORT void NumLibULTE(NUMLIB_NUMBER *a, NUMLIB_NUMBER *b, unsigned long *Result) Return result of (a <= b) (a and b unsigned)

[1141] EXPORT void NumLibCond(unsigned long *Condition, NUMLIB_NUMBER *a, NUMLIB_NUMBER *b,

[1142] NUMLIB_NUMBER *Result) Return result of Condition ? a : b. Equivalent to:

    if (*Condition == 0) {
        NumLibCopy(b, Result);
    } else {
        NumLibCopy(a, Result);
    }

Bitwise operations

[1143] EXPORT void NumLibNot(NUMLIB_NUMBER *a, NUMLIB_NUMBER *Result) Value pointed to by Result = ~a

[1144] EXPORT void NumLibAnd(NUMLIB_NUMBER *a, NUMLIB_NUMBER *b,

[1145] NUMLIB_NUMBER *Result) Value pointed to by Result = a & b

[1146] EXPORT void NumLibOr(NUMLIB_NUMBER *a, NUMLIB_NUMBER *b,

[1147] NUMLIB_NUMBER *Result) Value pointed to by Result = a | b

[1148] EXPORT void NumLibXor(NUMLIB_NUMBER *a, NUMLIB_NUMBER *b,

[1149] NUMLIB_NUMBER *Result) Value pointed to by Result = a ^ b

[1150] Concatenation Operations

[1151] In all the functions the int32 and int64 values are left aligned, in line with the plugin interface.

[1152] EXPORT void NumLibCat64_32(uint64 *a, unsigned long wa, unsigned long *b, unsigned long wb, NUMLIB_NUMBER *Result) Concatenate wa bits of 64 bit a and wb bits of 32 bit b and place it in value pointed to by Result. Value pointed to by Result = int wa a @ int wb b

[1153] EXPORT void NumLibCat32_64(unsigned long *a, unsigned long wa, uint64 *b, unsigned long wb, NUMLIB_NUMBER *Result) Concatenate wa bits of 32 bit a and wb bits of 64 bit b and place it in value pointed to by Result. Value pointed to by Result = int wa a @ int wb b

[1154] EXPORT void NumLibCat64_64(uint64 *a, unsigned long wa, uint64 *b, unsigned long wb,

[1155] NUMLIB_NUMBER *Result) Concatenate wa bits of 64 bit a and wb bits of 64 bit b and place it in value pointed to by Result. Value pointed to by Result = int wa a @ int wb b

[1156] EXPORT void NumLibCat32_n(unsigned long *a, unsigned long wa, NUMLIB_NUMBER *b, NUMLIB_NUMBER *Result) Concatenate wa bits of 32 bit a with value b and place it in value pointed to by Result. Value pointed to by Result = int wa a @ b

[1157] EXPORT void NumLibCatn_32(NUMLIB_NUMBER *a, unsigned long *b, unsigned long wb,

[1158] NUMLIB_NUMBER *Result) Concatenate value a with wb bits of 32 bit b and place it in value pointed to by Result. Value pointed to by Result = a @ int wb b

[1159] EXPORT void NumLibCat64_n(uint64 *a, unsigned long wa, NUMLIB_NUMBER *b,

[1160] NUMLIB_NUMBER *Result) Concatenate wa bits of 64 bit a with value b and place it in value pointed to by Result. Value pointed to by Result = int wa a @ b

[1161] EXPORT void NumLibCatn_64(NUMLIB_NUMBER *a, uint64 *b, unsigned long wb,

[1162] NUMLIB_NUMBER *Result) Concatenate value a with wb bits of 64 bit b and place it in value pointed to by Result. Value pointed to by Result = a @ int wb b

[1163] EXPORT void NumLibCat(NUMLIB_NUMBER *a, NUMLIB_NUMBER *b,

[1164] NUMLIB_NUMBER *Result) Concatenate value a with value b and place it in value pointed to by Result. Value pointed to by Result = a @ b

[1165] Drop Operations

[1166] EXPORT void NumLibDrop32(NUMLIB_NUMBER *a, unsigned long b, unsigned long *Result) Drop b bits from a and place it in 32-bit Result. Value pointed to by Result = a \\ b

[1167] EXPORT void NumLibDrop64(NUMLIB_NUMBER *a, unsigned long b, uint64 *Result) Drop b bits from a and place it in 64-bit Result. Value pointed to by Result = a \\ b

[1168] EXPORT void NumLibDrop(NUMLIB_NUMBER *a, unsigned long b, NUMLIB_NUMBER *Result) Drop b bits from a and place it in Result. Value pointed to by Result = a \\ b

[1169] Take Operations

[1170] EXPORT void NumLibTake32(NUMLIB_NUMBER *a, unsigned long b, unsigned long *Result) Take b bits from a and place it in 32-bit Result. Value pointed to by Result = a <- b

[1171] EXPORT void NumLibTake64(NUMLIB_NUMBER *a, unsigned long b, uint64 *Result) Take b bits from a and place it in 64-bit Result. Value pointed to by Result = a <- b

[1172] EXPORT void NumLibTake(NUMLIB_NUMBER *a, unsigned long b, NUMLIB_NUMBER *Result) Take b bits from a and place it in Result. Value pointed to by Result = a <- b

[1173] Shift Operations

[1174] EXPORT void NumLibLSL(NUMLIB_NUMBER *a, unsigned long b, NUMLIB_NUMBER *Result) Result = a << b

[1175] EXPORT void NumLibLSR(NUMLIB_NUMBER *a, unsigned long b, NUMLIB_NUMBER *Result) Result = a >> b. Logical right-shift: the top bits are zero-padded.

[1176] EXPORT void NumLibASR(NUMLIB_NUMBER *a, unsigned long b, NUMLIB_NUMBER *Result) Result = a >> b. Arithmetic right-shift: the top bits are sign-extended.

[1177] Bit Selection Operations

[1178] EXPORT void NumLibBitRange32(NUMLIB_NUMBER *a, unsigned long b, unsigned long c, unsigned long *Result) 32 bit value pointed to by Result = a[b - 1 : c]

[1179] EXPORT void NumLibBitRange64(NUMLIB_NUMBER *a, unsigned long b, unsigned long c, uint64 *Result) 64 bit value pointed to by Result = a[b - 1 : c]

[1180] EXPORT void NumLibBitRange(NUMLIB_NUMBER *a, unsigned long b, unsigned long c, NUMLIB_NUMBER *Result) Result = a[b - 1 : c]

[1181] Bit Insertion Operations

[1182] EXPORT void NumLibInsert32(unsigned long *a, unsigned long wa, unsigned long s,

[1183] NUMLIB_NUMBER *Result) Insert bits of a into Result with LSB at position s. The width of a is wa, and a is <= 32 bits wide.

[1184] EXPORT void NumLibInsert64(uint64 *a, unsigned long wa, unsigned long s,

[1185] NUMLIB_NUMBER *Result) Insert bits of a into Result with LSB at position s. The width of a is wa, and a is <= 64 bits wide.

[1186] EXPORT void NumLibInsert(NUMLIB_NUMBER *a, unsigned long s, NUMLIB_NUMBER *Result) Insert bits of a into Result with LSB at position s.

[1187] Plugins Supplied

[1188] The following plugins are supplied to assist in simulating Handel-C programs.

sharer.dll allows a port to be used by more than one plugin.

[1189] synchroniser.dll synchronizes Handel-C simulations so that they run at the correct rate relative to one another.

[1190] connector.dll connects simulation ports together so that data can be exchanged between simulations.

[1191] 7segment.dll simulates a 7-segment display.

[1192] Sharing a Port Between Plugins: sharer.dll

[1193] One can share a port between two or more plugins. One can share output ports to distribute the same data to multiple plugins. Input ports can be shared so that more than one plugin can feed data into the program (for example, to simulate tri-state ports). If more than one plugin provides data to the same port on the same clock cycle, the last piece of data fetched is the one used.

[1194] Syntax

[1195] To share a port, the with specification of the port or interface may contain:

    extlib="sharer.dll"
    extfunc="SharerGetSet"
    extinst="ShareRecords"

[1196] The ShareRecords string consists of a Share record for every plugin which a port needs to be connected to. Share records have the following syntax:

    Share={extlib=<lib-name>, extinst=<extinst-string>, extfunc=<func-name>}

The items within angle brackets have the same meaning as they have when they occur as the extlib, extinst and extfunc fields. FIG. 43 illustrates a plurality of possible values and meanings 4300 associated with libraries of the present invention. For example:

    interface bus_out () seg7_output (encode_out)
        with {extlib="sharer.dll",
              extinst=" \
                Share={extlib=<7segment.dll>, extinst=<A>, extfunc=<PlugInSet>} \
                Share={extlib=<connector.dll>, extinst=<SS(7)>, \
                extfunc=<ConnectorGetSet>} ",
              extfunc="SharerGetSet"};

[1197] Synchronizing Multiple Simulations: synchroniser.dll

[1198] If one wants to simulate multiple programs with different clock periods, one can use the synchroniser.dll. One then informs the synchronizer of their relative clock rates. The synchronizer may suspend simulations until they can complete a cycle in step with other simulations.

[1199] If one is single-stepping several synchronized simulations, some may be suspended until he or she has stepped other simulations to a point where the cycles coincide. There is always at least one simulation that can be stepped.

[1200] To complete a simulation that is synchronized with other paused simulations (i.e. in break mode), one may have to single step the paused simulations until the finishing simulation can complete.

[1201] Syntax

[1202] To invoke synchroniser.dll, one may use the following with specifications in the set clock statement:

    extlib="synchroniser.dll"
    extfunc="SynchroniserGetSet"
    extinst="clockPeriod"

[1203] The clockPeriod string should contain a positive integer that represents the period of the clock. This is assumed to be in the same time units for all simulations that are to be synchronized.

    set clock = external "P1"
        with {extlib="synchroniser.dll", extinst="100", extfunc="SynchroniserGetSet"};

[1204] Connecting Simulations Together: connector.dll

[1205] The connector allows one to connect two simulations together.

[1206] Syntax

[1207] One may connect a simulation to connector.dll by specifying the following in the with specification for a port.

    extlib="connector.dll",
    extinst="terminalName (width) [[bitRange]]",
    extfunc="ConnectorGetSet"

[1208] Where:

[1209] terminalName is the name of the virtual terminal that the port is connected to. It may be any Handel-C identifier. All ports connected to terminalName are connected together. The terminal is created if it does not exist.

[1210] width is the width of the terminal in bits. This should be the same for every occurrence of the same terminal name.

[1211] [bitRange] is optional. It specifies which bits of the port are connected to which bits of the terminal. If used, bitRange should specify the connections for all bits within the port.

[1212] Port bits are defined by their position within bitRange; terminal bits are specified by value. The first (leftmost) value in bitRange represents the most significant port bit, and the last (rightmost) value the least significant port bit. Terminal bits can be specified as an inclusive range [n:n], or a number. To leave a port bit unconnected, specify X as its terminal bit value.

[1213] If bitRange is omitted, bit 0 of the port is connected to bit 0 of the terminal, bit 1 to bit 1, etc. The string extinst="connect1(16)[13,14,X,X,11:8]" connects an 8-bit port to a 16-bit terminal connect1 with the cross-connections below in Table 1.

    TABLE 1
    Port bits    Terminal bits
    0            8
    1            9
    2            10
    3            11
    4            X
    5            X
    6            14
    7            13
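The bitRange rules above can be illustrated with a short C sketch that expands the bracketed part of the string into a port-bit to terminal-bit map. This is not the connector.dll implementation: the function name ParseBitRange, the UNCONNECTED marker and the MAX_PORT_BITS limit are assumptions made for illustration, and the sketch handles only the bracketed list, not the terminal name or width.

```c
#include <ctype.h>
#include <stdlib.h>

#define UNCONNECTED   (-1)   /* hypothetical marker for an X entry */
#define MAX_PORT_BITS 64     /* hypothetical upper limit for this sketch */

/* Fill terminalBit[portBit] for every port bit. Returns 0 on success,
 * -1 on a syntax error or if the list does not cover exactly portWidth bits. */
int ParseBitRange(const char *range, int portWidth, int *terminalBit)
{
    int list[MAX_PORT_BITS];   /* terminal bits in left-to-right order */
    int n = 0;
    const char *p = range;

    if (portWidth < 1 || portWidth > MAX_PORT_BITS) return -1;
    if (*p++ != '[') return -1;

    for (;;) {
        while (*p == ' ') p++;
        if (*p == 'X' || *p == 'x') {
            if (n >= MAX_PORT_BITS) return -1;
            list[n++] = UNCONNECTED;
            p++;
        } else if (isdigit((unsigned char)*p)) {
            char *end;
            long first = strtol(p, &end, 10);
            long last = first;
            p = end;
            if (*p == ':') {               /* inclusive range such as 11:8 */
                p++;
                if (!isdigit((unsigned char)*p)) return -1;
                last = strtol(p, &end, 10);
                p = end;
            }
            long step = (last >= first) ? 1 : -1;
            for (long b = first; ; b += step) {
                if (n >= MAX_PORT_BITS) return -1;
                list[n++] = (int)b;
                if (b == last) break;
            }
        } else {
            return -1;
        }
        while (*p == ' ') p++;
        if (*p == ',') { p++; continue; }
        if (*p == ']') break;
        return -1;
    }

    if (n != portWidth) return -1;
    /* The leftmost entry is the most significant port bit. */
    for (int i = 0; i < n; i++)
        terminalBit[portWidth - 1 - i] = list[i];
    return 0;
}
```

Called with "[13,14,X,X,11:8]" and a port width of 8, the sketch reproduces the cross-connections of Table 1.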

[1214]

    // Program A interface
    interface bus_out ( ) seg7_output (encode_out)
        with {extlib="connector.dll", extinst="SS(7)", extfunc="ConnectorGetSet"};

    // Program B interface
    interface bus_in (unsigned 7 in) seg7_input ( )
        with {extlib="connector.dll", extinst="SS(7)", extfunc="ConnectorGetSet"};

[1215] Simulating a 7-segment Display: 7segment.dll

[1216] The 7-segment display plugin allows one to connect a simulation of a seven segment display to a 7-bit wide output port.

[1217] Syntax

[1218] One may connect to 7segment.dll by specifying the following in the with specification for the 7-bit wide output port:

[1219] extlib = “7segment.dll”

[1220] extinst = “windowName”

[1221] extfunc = “PlugInSet”

[1222] When the Handel-C program is simulated, a window containing a single 7-segment display appears. The window has the title windowName. The program may invoke any number of 7-segment display windows. The segments correspond to the following bits (where bit 0 is the least significant bit). A bit value of 0 turns the segment on, 1 turns it off. The following array encodes the digits 0 to 9 to drive the 7segment.dll.

[1223] unsigned 7 encoder[10] = {0x01, 0x4f, 0x12, 0x06, 0x4c, 0x24, 0x20, 0x0f, 0x00, 0x04};
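The active-low convention (a 0 bit lights a segment) can be checked with a small C sketch that counts how many segments each encoded digit lights. The array values are those of the encoder table above; the function names LitSegments and LitSegmentsForDigit are assumptions introduced for illustration, and no particular assignment of bits to individual segments is assumed.

```c
/* Digit encodings from the text: 7 bits each, a 0 bit turns a segment on. */
static const unsigned char encoder[10] = {
    0x01, 0x4f, 0x12, 0x06, 0x4c, 0x24, 0x20, 0x0f, 0x00, 0x04
};

/* Count the segments lit by a 7-bit active-low value. */
int LitSegments(unsigned value)
{
    int lit = 0;
    for (int bit = 0; bit < 7; bit++) {
        if (((value >> bit) & 1u) == 0)
            lit++;
    }
    return lit;
}

/* Count the segments lit for a decimal digit, or -1 if digit is not 0..9. */
int LitSegmentsForDigit(int digit)
{
    if (digit < 0 || digit > 9)
        return -1;
    return LitSegments(encoder[digit]);
}
```

For instance, the encoding of 8 (0x00) lights all seven segments, while the encoding of 1 (0x4f) lights two.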

[1224] Example

[1225] This example consists of two separate Handel-C projects: Project A and Project B.

[1226] Project A:

[1227] Increments a modulo-10 counter every cycle and outputs the value of the counter to the 7segment.dll plugin.

[1228] Outputs the value of the counter to the terminal called SS(7) every cycle.

[1229] Project A's cycles are 100 time units long.

[1230] Project B:

[1231] Increments a modulo-10 counter on alternate cycles and outputs the value of the counter to the 7segment.dll plugin.

[1232] On the other cycles, reads the value from the terminal called SS(7) and outputs it to the 7segment.dll plugin.

[1233] Project B's cycles are 50 time units long.

[1234] Project B may be stepped twice for every step of project A.

[1235] FIG. 44 shows how the synchronization 4400 works when single-stepping the two projects in simulation.

[1236] At point 1 both simulations are ready to step. If one steps Project B first, it may suspend at point 2, as it cannot continue until A has caught up. A may be stepped. It may suspend before 4, as it waits for B to catch up. Meanwhile, B can complete its first step to reach 3. One can then step B, so that it can catch up with A, and both projects are ready to step. If one steps Project A first, it suspends, as it may wait for B to reach 4 before it can continue. Now he or she may step Project B. When B is stepped, it reaches 3. A may still wait. When B is stepped again, it catches A, and both A and B are ready to continue.
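The relationship between the two clock periods can be sketched in C under a simplified, assumed policy: a simulation may complete its next cycle only if that cycle does not end later than the next cycle end of any other simulation. This is an illustration of the idea only, not the algorithm used by synchroniser.dll, and it does not model a simulation suspending part-way through a step. The names SimClock, CanStep and TryStep are hypothetical.

```c
/* Per-simulation clock state (hypothetical structure for this sketch). */
typedef struct {
    unsigned long period;  /* cycle length in shared time units */
    unsigned long time;    /* time reached by completed cycles */
} SimClock;

/* Returns 1 if sims[i] may complete its next cycle, i.e. no other
 * simulation has a cycle that ends strictly earlier. Assumed policy. */
int CanStep(const SimClock *sims, int count, int i)
{
    unsigned long end = sims[i].time + sims[i].period;
    for (int j = 0; j < count; j++) {
        if (j != i && sims[j].time + sims[j].period < end)
            return 0;
    }
    return 1;
}

/* Step simulation i if allowed. Returns 1 if stepped, 0 if it must wait. */
int TryStep(SimClock *sims, int count, int i)
{
    if (!CanStep(sims, count, i))
        return 0;
    sims[i].time += sims[i].period;
    return 1;
}
```

With periods of 100 (Project A) and 50 (Project B), A cannot step first under this policy; B steps, then A, then B again, after which both have reached time 100. Under this policy the simulation whose next cycle ends earliest can always step, which is consistent with the statement that at least one simulation can always be stepped.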

[1237] If one single-steps the example above, two 7-segment display windows appear.

[1238] Once both simulations have passed the initialization part and entered the main loop, the windows should display these numbers.

[1239] Time units 0 50 100 150 200 250 200 250 300 350

[1240] A window 0 0 0 0 1 1 2 2 3 3 4 . . .

[1241] B window 0 1 0 2 1 3 2 4 3 5 . . .

Source file for Project A:

    set clock = external "P1"
        with {extlib="synchroniser.dll", extinst="100", extfunc="SynchroniserGetSet"};

    signal unsigned 7 encode_out;

    interface bus_out ( ) seg7_output (unsigned 7 output = encode_out)
        with {extlib="sharer.dll",
              extinst=" \
                Share={extlib=<7segment.dll>, extinst=<A>, extfunc=<PlugInSet>} \
                Share={extlib=<connector.dll>, extinst=<SS(7)>, \
                extfunc=<ConnectorGetSet>} ",
              extfunc="SharerGetSet"};

    // Define values to light 7-segment display from 0-9
    rom encoder [10] = {0x01, 0x4f, 0x12, 0x06, 0x4c, 0x24, 0x20, 0x0f, 0x00, 0x04};

    void main (void)
    {
        unsigned 4 count;
        count = 0;
        while (1)
        {
            par
            {
                count = (count==9) ? 0 : (count+1);
                encode_out = encoder [count];
            }
        }
    }

Source file for Project B:

    set clock = external "P1"
        with {extlib="synchroniser.dll", extinst="50", extfunc="SynchroniserGetSet"};

    signal unsigned 7 encode_out;

    interface bus_out ( ) seg7_output (unsigned 7 output = encode_out)
        with {extlib="7segment.dll", extinst="B", extfunc="PlugInSet"};

    interface bus_in (unsigned 7 in) seg7_input ( )
        with {extlib="connector.dll", extinst="SS(7)", extfunc="ConnectorGetSet"};

    // Define values to light 7-segment display from 0-9
    rom encoder [10] = {0x01, 0x4f, 0x12, 0x06, 0x4c, 0x24, 0x20, 0x0f, 0x00, 0x04};

    void main (void)
    {
        unsigned 4 count;
        count = 0;
        while (1)
        {
            par
            {
                count = (count==9) ? 0 : (count+1);
                encode_out = encoder [count];
            }
            encode_out = seg7_input.in;
        }
    }

[1242] More information regarding cosimulation will now be set forth.

[1243] WP26 Cosimulation Tool

[1244] The present section proposes a number of interfaces to be used to enable multiple simulators to be used together in a generic fashion. First of all the objectives of the present embodiment are explained.

[1245] Objectives

[1246] This section aims to establish a technique to enable multiple simulators to cosimulate with each other without having to rewrite simulator-specific plugin code.

[1247] It should be possible to make simulation-accuracy/simulation-speed trade-off decisions, so that different parts of the cosimulation execute with the desired degree of accuracy/speed.

[1248] Users of the simulators used in cosimulation should be able to write (in Handel-C, VHDL, C or whatever) the models being simulated independently of any other part of a cosimulation arrangement. This may enable reuse of models from one cosimulation arrangement to another.

[1249] Issues

[1250] Logic Values

[1251] High-Impedance/Tri-State Simulation:

[1252] Some support for high-impedance states is beneficial for making simulation components modular when buses (or other wires that may be driven by different components at different times) are involved.

[1253] Internal Resistance:

[1254] Modeling internal resistance helps model pull-up/pull-down resistors and keeps models independent. For digital circuits three levels should be adequate: zero, infinite, and ‘some’.

[1255] ‘Unknown’ Values:

[1256] If a floating input is read the result may be unpredictable. Similarly, if a circuit with a pull-up resistor is linked to a circuit with a pull-down resistor, there's nothing driving the circuit. In these situations, rather than picking an arbitrary result, propagating an ‘unknown’ result may be more informative. The logic systems in use are:

    9-valued logic (U, X, 0, 1, Z, W, L, H, -) (uninitialized, strong unknown, strong 0, strong 1, high impedance, weak unknown, weak 0, weak 1, don't care): VHDL, Swift/OMI, IEEE 1164
    4-valued logic (0, 1, Z, X): Verilog, SystemC
    2-valued logic (0, 1): Cynlib, Handel-C
    120-valued logic: Verilog, OMI, IEEE 1364

[1257] (Most of the 120 values are derived from permutations of different degrees of uncertainty of the values and strengths of values. Each value is represented by a strength 0 component combined with a strength 1 component. The strengths range over: Supply, Strong, Pull, Large, Medium, Small, HiZ.)

[1258] Using a two-valued system is fastest, but not entirely accurate. If one wishes to be able to determine if an LED may light up, he or she needs to be able to distinguish high-impedance from a logic zero. Being able to represent high-impedance also enables one to identify if two circuits are trying to drive a wire at the same time and flag this error to the user. High-impedance is also useful when the direction of information flow is not known. This isn't an issue for data-buses for example, as the write-enable line can tell one which way the data is flowing, but if he or she wished to model a switch linking two buses together, a simple two-valued logic system would run into trouble.

[1259] For fast simulation of correct circuits where logic values are used purely for passing information (not lighting up LEDs etc.) and the direction of information flow is known by the connected circuit elements, two-valued logic is sufficient.
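The difference between two-valued and four-valued logic can be sketched in C with a conventional wire-resolution function over the values 0, 1, Z and X. This is a generic illustration of the four-valued system mentioned above, not code from any of the named simulators; the type Logic4 and the functions Resolve and ToTwoValued are assumptions introduced here.

```c
/* Four-valued logic: 0, 1, high-impedance (Z) and unknown (X). */
typedef enum { LOGIC_0, LOGIC_1, LOGIC_Z, LOGIC_X } Logic4;

/* Resolve two drivers on the same wire.
 * A high-impedance driver yields to the other driver; two drivers that
 * agree give that value; any conflict (or an unknown input) gives X. */
Logic4 Resolve(Logic4 a, Logic4 b)
{
    if (a == LOGIC_Z) return b;
    if (b == LOGIC_Z) return a;
    if (a == b) return a;
    return LOGIC_X;
}

/* Collapse to two-valued logic. Z and X both become 0 here, so an
 * undriven wire can no longer be told apart from a driven logic zero. */
int ToTwoValued(Logic4 v)
{
    return v == LOGIC_1;
}
```

Resolving 0 against 1 yields X, which is the condition a simulator could flag as two circuits driving a wire at once; after ToTwoValued, that information is lost.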

[1260] Event-Based and Cycle-Based Simulation:

[1261] Some simulators are event based (ModelSim); some are cycle based (Handel-C, ARMulator, SingleStep). Event-based simulation is more general, as it determines on the fly what needs to be simulated when. Cycle-based simulations run according to a predetermined order of execution, which may give them a speed advantage.

[1262] When integrating event-based simulators, the ideal order of execution is not obvious. Consider the following cosimulation arrangement:

[1263] FIG. 44A illustrates a pair of simulators 4450, in accordance with one embodiment of the present invention. In this diagram, the dotted line 4452 represents dependencies, and the solid arrows 4454 are connections between simulators. If both simulators were cycle-based then the ideal order of execution would be one which didn't require either simulator to repeat a simulation cycle. This is achieved by synchronizing the simulators at a fine-grain enough level for changes in A to propagate down through to E in one simulation cycle. This scheduling order can be referred to as being Interleaved.

[1264] If both the simulators in the above arrangement were event-based, the natural order of evaluation would be to have each simulator wait for changes on its inputs, and then propagate the effects of these changes to its outputs. Thus simulators 1 and 2 execute three and four times, respectively. This scheduling order can be referred to as Fully-propagated.

[1265] If one simulator were cycle-based and one event-based, then the cycle-based simulator may be quicker if one uses a relatively fine-grained level of synchronization and only simulates the cycle once. However the event-based simulator may benefit from getting all its inputs at once and not one at a time. The work required by an event-based simulator to propagate the effects of the input-events to the outputs may be duplicated by feeding inputs in one at a time. Also if multiple input-events occur at once, they may cancel each other out in a way that saves an expensive computation. For example, if two inputs are fed into an xor gate, the output of which triggers some expensive computation, then if both inputs to the xor change, it makes a big difference if they occur simultaneously or sequentially.
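The xor example can be made concrete with a small C sketch that counts how often the expensive downstream computation would be triggered. The structure XorWatch and the functions XorInit and XorApply are hypothetical names used only for this illustration; each call to XorApply stands for one input event delivered to an event-based simulator.

```c
/* An xor gate whose output changes trigger an expensive computation. */
typedef struct {
    int a, b;                /* current input values (0 or 1) */
    int out;                 /* current output value */
    unsigned long triggers;  /* number of output changes seen so far */
} XorWatch;

/* Set the initial inputs without counting a trigger. */
void XorInit(XorWatch *w, int a, int b)
{
    w->a = a & 1;
    w->b = b & 1;
    w->out = w->a ^ w->b;
    w->triggers = 0;
}

/* Deliver one event carrying new values for both inputs. A trigger is
 * counted only if the output actually changes as a result. */
void XorApply(XorWatch *w, int a, int b)
{
    w->a = a & 1;
    w->b = b & 1;
    int out = w->a ^ w->b;
    if (out != w->out) {
        w->out = out;
        w->triggers++;
    }
}
```

Starting from inputs (0, 0), delivering both changes together as XorApply(w, 1, 1) causes no trigger, whereas delivering them one at a time as XorApply(w, 1, 0) followed by XorApply(w, 1, 1) causes two.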

[1266] When cosimulating with event-based and cycle-based simulators it may be desirable to enable the user to decide whether the simulator scheduling used should be most suited to cycle-based or event-based simulators. One can make an event-based simulator look like a cycle-based one, and a cycle-based simulator look like an event-based one. The question is which approach is best, and the answer is likely to be different in different circumstances.

Supporting hardware emulators:

[1267] Using hardware to emulate a target board is common practice. However co-simulation is done, it shouldn't preclude the possibility of mixing hardware emulation with software simulation.

[1268] Multi-Processor Systems:

[1269] The cosimulation methods used should be able to take advantage of multiple processors and possibly multiple computers. The extent to which parallelism can be exploited is influenced by the proportion of computation to communication/synchronization. Synchronization over a network is viable, despite the potential communications overhead. A cost-benefit analysis may be necessary prior to implementation. For very fast simulators, the communications overhead of synchronizing the simulators may be greater than the benefits gained when dealing with two processes on the same multiprocessor computer. However, without radical restructuring of the implementations of all (but one) of the simulators being used, one may incur the possible synchronization overhead.

[1270] Buffering Communication Between Simulators:

[1271] When the degree of communication between simulators is low,allowing the simulators to run ahead of each other can reduce the amountof context switching between processes and increase simulation speed.Using a cosimulation scheme which doesn't preclude such optimizationsmay be beneficial. When debugging, having simulators running ahead ofeach other may cause problems, if the simulator lagging behind reaches abreakpoint before catching up with the other simulator, then the usermay see the two simulators in an inconsistent state.

[1272] Starting and Identifying Simulators:

[1273] Whether the user should manually start each simulator or whethera co-simulation program should launch them is also an issue. If twoinstances of the same simulator are automatically started then they canbe passed arguments through environment variables so they identifythemselves to the co-simulator program. If the simulators are manuallystarted from the start menu, they may gain their identity after theyestablish communication with the cosimulation program.

[1274] A pool of simulators could be used to avoid repeated starting and quitting of simulators. When a cosimulation session ends, the simulators would enter a pool of available simulators ready for another cosimulation session. Simulators typically provide a window where arbitrary output can be sent; this would be used to indicate which simulator was doing what.

[1275] Automatically Controlling the Simulators

[1276] Ideally the state of a simulator could be controlled on startup and during execution. For example, to simulate the Kompressor board one doesn't want to require the user to load up three different programs on startup for the two FPGAs and the processor.

[1277] Similarly, when one FPGA or the processor reconfigures the other FPGA, one doesn't want to involve the user. SingleStep and ModelSim both provide scripting languages which may help in these situations. The memory in the ARMulator can be set by plugins, but there doesn't appear to be a way for plugins to change the associated symbol tables and debugging information. Handel-C doesn't enable plugins to change the circuit currently being simulated.

[1278] Integrating Simulator GUIs

[1279] It's relatively easy to co-simulate simulators together by having each pretend to be peripheral hardware plugged into the others. Each simulator thinks it's in charge, and has no knowledge that other fully fledged simulators with their own GUIs are being used for the plugin peripheral hardware. If one wishes to be able to use the debugging functions of one GUI to control all the simulators, then the plugins need a way to pause and resume the simulators. A fudge would be to have the plugins prompt the user to pause and resume the simulators, but this would quickly become tedious and annoying. ModelSim enables plugins to pause the simulation, but it doesn't enable them to resume simulation. Other simulators (Handel-C, ARMulator, SingleStep) don't allow plugins to pause simulation.

[1280] Another issue arises from the different simulators allowing simulation to stop at different times. SingleStep only allows simulation to stop between instructions. Handel-C only allows simulation to stop between clock cycles. ModelSim allows simulation to stop anywhere. This would be a problem, for example, when one wishes to advance time by less than a clock cycle in ModelSim: if the ModelSim simulation relied on asynchronous circuits simulated by Handel-C then the Handel-C GUI would not be available mid clock cycle. It may also be a problem when cosimulating two microprocessors if the instructions on different microprocessors don't start and finish on the same clock cycles.

[1281] Depending on the level of communication between two simulators, it may be possible to allow one to run ahead of another so both can be stopped. This may be confusing for the user though, as each simulator would have a different idea of what the time was.

[1282] Integrate with Other Vendors' Cosimulation Tools

[1283] Synopsys, Mentor Graphics, Cadence, Innoveda and Arexsys all provide cosimulation tools (Eaglei, Seamless, Affirma HW/SW Verifier, Virtual-CPU and CosiMate respectively). These enable the integration of a variety of HDL simulators with Instruction Set Simulators (ISS).

[1284] Synopsys' Eaglei, Mentor Graphics' Seamless and Arexsys' CosiMate all provide support for a wide range of HDL simulators. Processor support for Synopsys' Eaglei and Mentor Graphics' Seamless is provided mostly through Mentor Graphics' XRAY debuggers, and Arexsys' CosiMate supports “a wide range of C debuggers”. Integrating the Handel-C simulator into one or all of these cosimulation tools is possible.

[1285] There's a fair amount of documentation on Mentor Graphics Seamless CVE (CoVerification Environment). One important aspect of Seamless is the way it can speed up execution by reducing the amount of time spent simulating hardware. This enables the ISS to proceed unhindered for a significant amount of time. Minimizing hardware simulation is done through a number of optimizations: data access optimizations, instruction fetch optimizations and time optimizations. The data access and instruction fetch optimizations prevent the hardware simulator from seeing bus activity during bus-cycles it is not interested in. The hardware simulator is however still advanced in time. The time optimizations effectively stop the hardware simulator seeing clock changes during bus-cycles it is not interested in. This enables the ISS to be run for many cycles at a time without context switching.

[1286] For a cycle-based simulator like Handel-C, the data access and instruction fetch optimizations would make no difference: the simulator has just as much work to do whether it sees changes on the bus or not. The time optimizations would make a difference. Knowing when it is safe to use time optimizations is not easy. Seamless CVE allows the user to enable or disable time optimizations according to how the memory is being accessed; whether this still maintains simulation accuracy is left to the user to decide. If it were possible to automatically know when a simulator had reached a stable state then the time optimizations could be made more reliably and generically, rather than relying on CPU memory access to give hints. Possibly a user could modify their design to tell the cosimulation tools when their design had reached a stable state and would only need to see another clock change when something else changed in a relevant way as well.

[1287] Cosimulation via OMI

[1288] The Object Model Interface is an interfacing standard (IEEE 1499), which enables models written using one tool to be incorporated into another. Tools known as Model Compilers or Model Packagers take a VHDL/Verilog/C description of a model and turn it into object code which is OMI compliant and can be imported into other simulation tools. OMI provides a means for IP vendors to provide simulation models of their IP without giving away the source code. By being an open standard OMI also increases interoperability.

[1289] OMI was created by the Open Model Forum. Synopsys, Mentor Graphics and Cadence were all involved in the process. OMI combines two APIs, a simulator-API and a model-API. The simulator-API is based on SWIFT from Synopsys and deals with interfacing models to the rest of a simulation. The model-API is based on a proposal from Cadence and is concerned with the internal workings of the models. Cadence products already support OMI, and several other vendors have pledged to support it soon. Use of SWIFT is currently more common than OMI, but SWIFT may become a legacy standard.

[1290] OMI is a relatively complex standard and supporting it would be a sizeable undertaking. It doesn't provide specific support for cosimulation and would quite possibly be a hindrance in optimizing cosimulation to run faster. It would be possible to involve OMI models in a cosimulation, but to use OMI interfaces as the sole means of communication in cosimulation would most likely be overly restrictive.

[1291] Cosimulation with SystemC

[1292] It would be possible to use models written in SystemC in a cosimulation arrangement. It may also be possible to use SystemC as the means of integration of a number of models not written in SystemC. However, as with OMI, restricting integration to just what can be achieved via SystemC is likely to be overly restrictive. Although the source to SystemC is freely available, modifications to it can only be distributed back to the SystemC committee. So improving SystemC to better support cosimulation of non-SystemC models is not likely to be possible.

[1293] Proposed Architecture

[1294] Two categories of simulation models are proposed, light-weight thread-sharing models and heavy-weight process-hogging models. They each have the following characteristics:

[1295] Light-weight thread-sharing models:

[1296] Implemented as a dll.

[1297] No blocking functions.

[1298] Able to instantiate and interconnect sub-models.

[1299] Heavy-weight process-hogging models:

[1300] Implemented as a separate process, requiring IPC for synchronization/communication.

[1301] Not able to instantiate sub-models.

[1302] Light and heavy refer to the communication overhead of using these models, and not the simulation overhead of the models. Light-weight models would typically be used for implementing very simple glue logic, clocks and optimization logic (see later). Heavy-weight models would typically be used for wrapping up existing simulators which have a plug-in interface. However there would be no disadvantage in a computationally intensive model using the light-weight model interface; in fact it would give the advantage of added flexibility when deciding which computations should be performed in which processes.

[1303] Execution of the cosimulation environment would consist of a number of distinct stages: instantiation, analysis and simulation. Instantiation begins with a single root model, which would typically be a light-weight model which instantiates and connects up other models. This root model could be a simple ‘C’ program or an elaborate GUI which allowed the user to interactively instantiate and connect up models. The models instantiated have a hierarchical relationship; there is no global naming policy for the ports on a model. Models are only able to communicate via ports they have been given by their parent or children.

[1304] After initialization is complete an automatic analysis stage begins. The hierarchical relationship becomes irrelevant and the interconnections between models are analyzed as an unstructured network. At this stage the cosimulation tool builds up any structures it may need at simulation time. Optimizations like static scheduling decisions belong here.

[1305] During simulation dynamic scheduling decisions are made, process-hogging models are synchronized with each other, and communication between models is handled.

[1306] Processes and DLLs

[1307] FIG. 44B illustrates a cosimulation arrangement 4462 including processes and DLLs. The present figure shows three processes 4464; each process contains a program 4466 and a number of dlls 4468.

[1308] The Cosim HQ program starts everything off. It starts off the root model which is a light-weight model existing as a dll in the same process as the Cosim HQ program. This model then instantiates and connects other models. Other light-weight models are simply loaded into the same process.

[1309] Starting up process-hogging models is a little more involved. For a light-weight model to instantiate a process-hogging model, the light-weight model may know the name of a simulator specific launcher dll. This name is passed to Cosim HQ which gives the launcher dll details of how IPC is to be achieved. The launcher dll then loads up the simulator which may at some point load up the simulator specific cosim plugin, which loads up a generic cosim dll. The simulator specific launcher and cosim plugins may cooperate in passing the IPC connection information from Cosim HQ to the generic cosim dll. Once this has been achieved communication between the two processes can take place. The techniques described here avoid the simulator specific plugins needing to know how IPC takes place, and avoid the cosimulation program needing to know how to start up and pass parameters to every different kind of simulator.

[1310] Communication may take place between processes on one machine, or between multiple machines across a network. Only cosimulation specific code needs to be concerned with this. The simulator specific code needn't be concerned.

[1311] Similarly any mechanism may be used to pass connection details from the launcher dll to the generic cosim plugin, such as command line arguments, environment variables, shared memory or files, and only the simulator specific code needs to know about it, not the cosimulation code.

[1312] Light-Weight Models

[1313] Light-weight models can be used for models which are computationally cheap and which one wants to keep isolated from other models. A clock is an example: one wouldn't want a separate process just to contain a clock model, but one wouldn't want to have to arbitrarily pick another model in which to put the clock either, as this would hinder interoperability between models. Light-weight models can also be used for optimizations such as preventing a hardware simulator seeing the clock when a CPU is doing something unrelated to the hardware.

[1314] Light-weight models needn't exist in the same process as the Cosim HQ program. The Cosim HQ program and the generic cosim dlls may conspire between them to achieve the desired execution order in any way they please. One could migrate light-weight models out to other processes. For example if an ISS is able to simulate many cycles without a hardware simulator being involved, it would be desirable for the clock generation code to be in the same process as the ISS. If the generic cosimulation dll is clever enough then communication between the process-hogging simulators and the Cosim HQ may be reduced or eliminated altogether, thus reducing the number of context switches. Each process loading the generic cosim dll may become capable of direct communication with other simulators; communication needn't go via the Cosim HQ.

[1315] Optimizations

[1316] Light-weight models can be used to shield a hardware simulator from details it doesn't need to see. If a light-weight model is placed between the ISS and the hardware simulator, then with some configuration the light-weight model can use address decoding to determine whether the hardware simulator needs to run or not. Knowing when it's safe to not clock the hardware simulator is application specific. A pathologically unoptimizable example would be using hardware to profile a CPU's activity; in most cases though significant optimizations should be possible.

[1317] Application Programming Interfaces: Exchanging Interfaces

[1318] The interfaces between programs and dlls are defined by a number of header files. There may be a number of interfaces between a given program-dll/dll-dll pair. Each program or dll provides a mechanism by which an interfacing program/dll may request access to a named interface. Before an interface may be requested though, the mechanisms by which interfaces are obtained are exchanged between the communicating program/dlls.

[1319] typedef void* GetInterfaceT(void* state, char* ifname);

[1320] int ExchangeInterfaces(GetInterfaceT*, void*, GetInterfaceT**, void**);

[1321] The dll being loaded implements ExchangeInterfaces. The initiating program/dll calls ExchangeInterfaces with a function which the dll being loaded may call to obtain interfaces. It also passes a void pointer which should be passed to the GetInterface function whenever it is called. This void pointer may point to anything the initiating program/dll wants, including NULL if the initiating program/dll has no use for it. The initiating program/dll also receives back a corresponding GetInterface function and associated void pointer.

[1322] Accessing interfaces by name makes it possible to add new interfaces and support multiple versions of an interface. If interface names were ever to be created outside Celoxica, the names could incorporate GUIDs (Globally Unique IDentifiers) but this seems unlikely to be necessary.
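
A minimal sketch of how a loaded dll might implement this exchange is given below. The typedef and the ExchangeInterfaces signature follow the declarations above; everything else (DemoIFT, the interface name "Demo_v1", and the convention that 0 means success) is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void *GetInterfaceT(void *state, char *ifname);

/* Hypothetical interface structure: a table of function pointers. */
typedef struct {
    int (*version)(void);
} DemoIFT;

static int demo_version(void) { return 1; }
static DemoIFT demo_if = { demo_version };

/* The loaded dll's lookup function: NULL for names it does not know,
 * which is what lets callers probe for newer interface versions. */
static void *dll_get_interface(void *state, char *ifname)
{
    (void)state;
    if (strcmp(ifname, "Demo_v1") == 0)
        return &demo_if;
    return NULL;
}

/* What the dll remembers about the initiating program/dll. */
static GetInterfaceT *host_get = NULL;
static void *host_state = NULL;

int ExchangeInterfaces(GetInterfaceT *hostGet, void *hostState,
                       GetInterfaceT **dllGet, void **dllState)
{
    assert(dllGet != NULL && dllState != NULL);
    host_get = hostGet;      /* kept for later requests to the host */
    host_state = hostState;  /* may legitimately be NULL */
    *dllGet = dll_get_interface;
    *dllState = NULL;        /* this dll needs no per-caller state */
    return 0;                /* assumed convention: 0 means success */
}
```

After the call, the initiating side requests an interface by name through the returned function and pointer, and treats a NULL result as "interface not supported".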

[1323] Interfaces

[1324] For interfacing between models there are three kinds of interface:

[1325] Init: for initialization and termination

[1326] CommSync: for communication and synchronization

[1327] Control: for cross-model breakpoint/stop/start control

[1328] These interfaces are implemented for each of the three model types:

[1329] Light

[1330] Event

[1331] Cycle

[1332] Each interface has two sides, a simulator side and a cosimulation side. Also there is an interface for using the launching dlls. This gives a total of 19 interfaces (three kinds of interface for each of three model types, each with two sides, plus the launch interface). Each interface has a structure containing function pointers to the functions that interface may support. To implement an interface the programmer may create an instance of the required structure. The 19 interface structure types are listed here:

[1333] Init_CoCycle_IFT

[1334] Init_SimCycle_IFT

[1335] CommSync_CoCycle_IFT

[1336] CommSync_SimCycle_IFT, Control_CoCycle_IFT

[1337] Control_SimCycle_IFT, Init_CoEvent_IFT

[1338] Init_SimEvent_IFT

[1339] CommSync_CoEvent_IFT

[1340] CommSync_SimEvent_IFT, Control_CoEvent_IFT

[1341] Control_SimEvent_IFT

[1342] Init_CoLight_IFT

[1343] Init_SimLight_IFT

[1344] CommSync_CoLight_IFT

[1345] CommSync_SimLight_IFT, Control_CoLight_IFT, Control_SimLight_IFT

[1346] Launch_SimProcess_IFT

[1347] The functions defined in these interfaces are detailed in the header files: cosim-light.h, cosim-event.h, cosim-cycle.h, cosim-launch.h. If the ability to simultaneously save and restore state across a number of simulators is to be implemented then further interfaces may be defined.

[1348] Datatypes

[1349] Initially this embodiment may only support 2-valued and 4-valued logic values. When ports are declared they may have a type associated with them. These types are represented by abstract C values. Some are predefined, e.g. bitType, logic4Type, logic9Type, int64Type, int32Type, int16Type, int8Type, realType, doubleType. There are also a number of functions enabling the user to create vector types, e.g. mkBitVectorT(uint), mkLogic4VectorType(uint), mkLogic9VectorType(uint). Finally, if the user wishes to use another type altogether they may create their own type with the function userType(char* name, int size). So long as other parts of a cosimulation arrangement agree on how large this user type is, the cosimulation tool may allow them to do what they like with data of this type.

[1350] Values are given the abstract type ValueT. This is a void pointer. For bit-vector types it may point to a memory location containing bits packed into bytes, i.e. a 32-bit long bit-vector may just be 4 bytes in memory. For 4-valued logic vectors, ValueT may point to a Logic4VectorT struct containing two more pointers, bitKind and bitValue. bitKind and bitValue each point to bits packed into bytes in memory. For a given bit location the values in bitKind and bitValue determine the 4-valued logic value as shown in Table 2.

TABLE 2
bitKind   bitValue   4-valued logic
0         0          Z
0         1          X
1         0          0
1         1          1

[1351] This enables very quick checks to be performed to see if an entire logic-vector consists of 0s or 1s, or to check if an entire vector is in a HiZ state. This is useful as typically a bus may either be fully driven or fully floating. (The implementation of SystemC makes this sort of check a much slower process). The header file cosim-types.h contains the type declarations and function prototypes for declaring and using types in cosimulation.

[1352] When converting from 4-valued logic to 2-valued logic one has some freedom in converting X and Z values. Options include always converting them to 0, converting them to the previous value so as to minimize events, and converting them to a random value in order to stress test a model. Or one could consider an attempt to read an X or Z to be an error, and flag it at run-time.
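
The quick whole-vector checks can be sketched as below, using the two packed bit arrays of Table 2 (called bitKind and bitValue here). The struct layout is an assumption: in particular the nbytes field, the restriction to widths that are a multiple of 8, and least-significant-bit-first packing are not stated in the text and are chosen only to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed layout for a 4-valued logic vector. */
typedef struct {
    unsigned char *bitKind;   /* per bit: 0 = Z or X, 1 = driven 0 or 1 */
    unsigned char *bitValue;  /* per bit: selects within the kind */
    size_t nbytes;            /* vector width is nbytes * 8 bits */
} Logic4VectorT;

static bool all_bytes(const unsigned char *p, size_t n, unsigned char v)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (p[i] != v)
            return false;
    return true;
}

/* Whole vector floating: kind = 0 and value = 0 everywhere. */
bool logic4_all_hiz(const Logic4VectorT *v)
{
    assert(v != NULL);
    return all_bytes(v->bitKind, v->nbytes, 0x00) &&
           all_bytes(v->bitValue, v->nbytes, 0x00);
}

/* Whole vector driven low: kind = 1 and value = 0 everywhere. */
bool logic4_all_zeros(const Logic4VectorT *v)
{
    assert(v != NULL);
    return all_bytes(v->bitKind, v->nbytes, 0xFF) &&
           all_bytes(v->bitValue, v->nbytes, 0x00);
}

/* Decode one bit according to Table 2: returns 'Z', 'X', '0' or '1'. */
char logic4_get(const Logic4VectorT *v, size_t bit)
{
    unsigned kind, val;

    assert(v != NULL && bit < v->nbytes * 8);
    kind = (v->bitKind[bit / 8] >> (bit % 8)) & 1u;
    val = (v->bitValue[bit / 8] >> (bit % 8)) & 1u;
    if (kind)
        return val ? '1' : '0';
    return val ? 'X' : 'Z';
}
```

Because each check is a byte-wise comparison over two arrays, testing a fully driven or fully floating bus costs one pass over packed memory rather than a per-bit decode.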
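
The conversion options listed above could be expressed as a policy parameter, as in the following sketch. The enum and function names are hypothetical, and returning -1 is just one assumed way of flagging the run-time error case.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { L4_Z, L4_X, L4_0, L4_1 } Logic4T;

/* One policy per option described in the text. */
typedef enum {
    XZ_TO_ZERO,      /* always convert X and Z to 0 */
    XZ_TO_PREVIOUS,  /* keep the previous value, minimizing events */
    XZ_TO_RANDOM,    /* random value, to stress test a model */
    XZ_IS_ERROR      /* reading X or Z is flagged as an error */
} XzPolicyT;

/* Returns 0 or 1, or -1 when the policy is XZ_IS_ERROR and the
 * input is X or Z. 'previous' is the last 2-valued value delivered. */
int logic4_to_bit(Logic4T in, XzPolicyT policy, int previous)
{
    assert(previous == 0 || previous == 1);
    if (in == L4_0)
        return 0;
    if (in == L4_1)
        return 1;
    switch (policy) {
    case XZ_TO_ZERO:
        return 0;
    case XZ_TO_PREVIOUS:
        return previous;
    case XZ_TO_RANDOM:
        return rand() & 1;
    case XZ_IS_ERROR:
        break;
    }
    return -1;
}
```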

[1353] Initialization

[1354] Cosimulation always starts off with one root model. As only light models can instantiate child-models the root model may be a light model if any more than one model is to run. During instantiation a model may create ports of any type and declare dependencies between these ports. Once a child model is instantiated, the parent may examine which ports the child created, and may then connect the ports to any other (type-compatible) ports.

[1355] Simulation

[1356] After the hierarchy of models created during initialization has been flattened out to a non-hierarchical network, simulation can begin. Cycle-based models call functions in the CommSync interface to read and write ports whenever they want; synchronization is achieved by blocking the returns of these function calls. Event-based simulators output whenever they want, and request to be informed of input events when they are ready for them. Light-weight models are implemented as event-based models, and no functions are allowed to block. Simulators are able to register wake-up calls for simulating internally timed logic; the simulators may be woken up earlier if another simulator triggers off an event.

[1357] Launching

[1358] It is the responsibility of the programmer integrating a simulator with the cosimulation tool to write a launch dll. This dll would typically start up a new simulator process but it doesn't have to; it could pick an already existing simulator from a pool of idle simulators. If a simulator disconnects from a cosimulation arrangement early, then the launch dll may be called in the middle of simulation to resurrect the disconnected simulator. This resurrection would be necessary in situations like a user resetting a Handel-C program. Handel-C terminates all plugins and then restarts them when resetting a program. For this not to have adverse effects on the cosimulation, the cosimulation tool may allow a simulator to disconnect and reconnect as long as it declares just the same ports with the same names, types and dependencies. One could use the dynamic relaunching as a means of hot-swapping simulators, but that's not what it's meant for.

[1359] The launching dll should assume it is to start the simulation process on the same computer it is running on. If the cosimulation tool wishes to run a simulator on another host, the cosimulation tool may itself be responsible for running the launching dll on the remote host. The launching dll may be given connection info which should be passed on via the simulator and the simulator specific plugin to the generic cosim dll, which may understand the connection info and establish a connection back to the cosim tool over the network, and possibly to other generic cosim dlls on other hosts.
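
The reconnection rule, that a relaunched simulator must declare the same ports with the same names, types and dependencies, might be checked along these lines. PortDeclT and its fields are hypothetical: the text does not say how types or dependencies are represented, so an integer type handle and a dependency bit mask are assumed, and ports are assumed to be declared in the same order each time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed record of one declared port. */
typedef struct {
    const char *name;
    int type_id;     /* hypothetical handle for the declared type */
    unsigned deps;   /* hypothetical bit mask of ports this one depends on */
} PortDeclT;

/* True when a reconnecting simulator declared exactly the ports that
 * the disconnected one had: same count, names, types and dependencies. */
bool same_ports(const PortDeclT *before, size_t nbefore,
                const PortDeclT *after, size_t nafter)
{
    size_t i;

    assert(before != NULL || nbefore == 0);
    assert(after != NULL || nafter == 0);
    if (nbefore != nafter)
        return false;
    for (i = 0; i < nbefore; i++) {
        if (strcmp(before[i].name, after[i].name) != 0 ||
            before[i].type_id != after[i].type_id ||
            before[i].deps != after[i].deps)
            return false;
    }
    return true;
}
```

A reconnection that fails this check would have to be rejected, since the flattened network built during analysis would no longer match the simulator's ports.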

[1360] Alternatives

[1361] Instead of allowing child models to declare whichever ports they want, and have the parent model figure out how to wire the ports up, one could have the parent declare a number of signals and pass these to child processes. By passing the same signal to more than one child model a connection would implicitly be made. It would then be the responsibility of the child to check the signals passed in were appropriate. Declaring signals first is less suited to an interactive graphical instantiation and connection tool. A user would probably find it easier to instantiate a model and see which ports they got back, rather than having to correctly predict which ports a child model may want. One could provide both techniques together. SystemC allows signals to be passed into and ports to be passed back from a model being instantiated; CynApps only allows signals to be passed into a model. It's probably best to stick to the relatively simple technique of allowing child models to create whichever ports they like, until advantages of enabling both techniques are found in practice.

[1362] SystemC allows models to be implemented in either a non-blocking way similar to the light-weight models described here, or to use cooperative non-preemptive multi-threading to allow multiple models to execute in a relatively light-weight manner without OS calls. This kind of multi-threading may make it easier to write more complex light-weight models; however, apparently it makes execution slower. This kind of light-weight threading may be worth supporting if people outside Celoxica are going to write moderately complex light-weight models.

[1363] Further Work

[1364] There are three different levels of user for this cosimulationtool:

[1365] People integrating new simulators

[1366] People writing light-weight models

[1367] The final users of a cosimulation arrangement

[1368] Documentation needs to be provided for each of these types of user. The documentation for the final users may contain simulator and arrangement specific parts. Different kinds of optimizations need to be experimented with. Optimizations in other cosimulation tools have arisen out of necessity following experiences with simulations that just ran too slowly.

[1369] Cosimulation Algorithms and Programming Interfaces

[1370] This section explores different algorithms that could be used for cosimulating any number of event-based and cycle-based simulators and the implications this has on the programming interfaces used. The present section considers three types of simulator:

[1371] Event-based, such as ModelSim

[1372] Cycle-based synchronous, such as SingleStep and ARMulator, where simulation of asynchronous logic is not performed and cycles cannot be repeated

[1373] Cycle-based asynchronous, such as Handel-C and probably other cycle-based simulators such as Cyclone (Synopsys HDL simulator); here asynchronous logic can be simulated, and simulation cycles can be repeated as necessary.

[1374] If cosimulation is required with simulators which simulate asynchronous logic but don't allow cycles to be repeated, then some cosimulation arrangements may be unsimulatable, and it may be necessary to give compile-time or run-time errors in these circumstances. All simulators may either only simulate untimed logic or may provide a means by which a cosimulation plugin can find out when the next event is due and provide earlier stimulus if necessary.

[1375] These different types of simulator may be wrapped up so as to enable communication between different simulators. This wrapping may make each simulator look like an event-based simulator and may contain additional information and interfaces to help in scheduling simulator execution.

[1376] Scheduling Event-Based Simulators

[1377] Wrapping up event-based simulators to look like event-based simulators is relatively easy. Issues involve propagating input events and detecting output events. It doesn't appear to be possible for a plugin to instruct ModelSim to process all current events without advancing simulation time. Advancing simulation time by a very small amount is one solution to this, so long as repeated simulation doesn't result in these small amounts adding up to something significant. ModelSim can be instructed to call callback routines whenever a signal changes.

[1378] Scheduling Cycle-Based Synchronous Simulators

[1379] Cycle-based synchronous simulators (such as an Instruction Set Simulator (ISS)) have a very fixed idea of the order in which evaluation should proceed. Fortunately, as they do not simulate asynchronous logic, it is never necessary to request such a simulator to resimulate a cycle. Cycle-based synchronous simulators are sensitive only to active clock-edges; all other changes can be ignored. Wrapping such a simulator up as an event-based simulator is straightforward.

[1380] Scheduling Cycle-Based Asynchronous Simulators

[1381] There are a number of different ways for execution of a cycle-based asynchronous simulator to proceed. Here one can explore some different scheduling policies.

[1382] Ideally when wrapping such a simulator up as an event-based simulator the clock input shouldn't be treated as a special case. A simple approach would be to wait for an input event to arrive, and then advance the simulator far enough for the effects of the input change to propagate to all dependent outputs. If there are no current input events pending then simulation time is advanced until the next future event is scheduled. This may typically cause a clock input to a cycle-based simulator to change, but in general it could be any input.

[1383] ASAP / Eager Simulation

[1384] One need not know which outputs depend on which inputs; one can be conservative and assume all outputs depend on all inputs. When a simulator gets the chance to run again, it can check to see if any inputs have changed and if so advance far enough for all outputs to be updated.

[1385] This approach to simulation makes no assumptions about the order in which the cycle-based simulator gets and sets inputs and outputs, and it makes no assumptions about the dependencies between inputs and outputs. It does not require the concept of a start of a simulation cycle and the end of a simulation cycle. As each output is recomputed by the simulator, one can check to see if it has changed, and if so propagate the effects to other simulators. The order in which the simulators execute is not too critical. One could run just one at a time, or all simultaneously.

[1386] Simulation in Turns

[1387] If running just one simulator at a time, all simulators but one would be stopped using OS-level wait operations, and just one would proceed. When it has finished one can check if any other simulators need to execute. If so, one is picked arbitrarily to go next; otherwise simulation time is advanced.
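
The turn-taking policy can be sketched in a single thread as follows. This is a simplified model under stated assumptions: SimT, its fields and the two example step functions are hypothetical, the OS-level wait operations are replaced by ordinary function calls, and the "arbitrary" choice is simply the lowest index. The max_turns limit is an added safeguard against arrangements that never settle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct SimT SimT;
struct SimT {
    bool needs_run;                        /* an input changed since last run */
    void (*step)(SimT *sims, size_t self); /* may set needs_run on others */
    unsigned runs;                         /* how often this one has run */
};

/* Run one simulator at a time until none needs to execute.
 * Returns the number of turns taken; afterwards the caller may
 * advance simulation time. */
unsigned settle_in_turns(SimT *sims, size_t n, unsigned max_turns)
{
    unsigned turns = 0;
    bool progressed = true;
    size_t i;

    assert(sims != NULL || n == 0);
    while (progressed && turns < max_turns) {
        progressed = false;
        for (i = 0; i < n; i++) {
            if (sims[i].needs_run) {
                sims[i].needs_run = false;
                sims[i].step(sims, i);
                sims[i].runs++;
                turns++;
                progressed = true;
                break;   /* arbitrary pick: lowest index goes next */
            }
        }
    }
    return turns;
}

/* Example: simulator 0 changes an input of simulator 1 on its first run. */
void step_producer(SimT *sims, size_t self)
{
    if (sims[self].runs == 0)
        sims[1].needs_run = true;
}

/* Example: simulator 1 consumes its input and changes nothing else. */
void step_consumer(SimT *sims, size_t self)
{
    (void)sims;
    (void)self;
}
```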

[1388] Simultaneous (Multi-Processor) Simulation

[1389] If cosimulating two low-computation/high-communication simulators on a multiprocessor system then one could get away with fewer OS-level calls. One could have a simulator running on each processor. No synchronization would be needed for passing word sized data between the simulators. For larger data transfers, busy-wait mutual exclusion techniques would be an efficient mechanism for maintaining data integrity. Each simulator would loop as fast as it liked until none of its inputs changed, then it would use an OS-level wait function to wait to see if any of the other simulators subsequently changed the inputs.

[1390] When all simulators reach this waiting state then simulation time can advance, typically causing a clock signal to change.

Semantic Implications of Evaluation Order

These two techniques could result in different results being computed depending on the order in which simulators execute. For example if one simulator is going to change two outputs from (1,0) to (0,1), and another simulator is going to AND these two values together, the order in which the two simulators read and write these values may affect the result. The output of the AND may pulse high for an infinitesimally short length of time, or it might not. If some circuit counts these pulses then the implication could compound. These problems could only occur in badly designed circuits; the issues involved are inherent in true hardware as well and so may be in any simulation of it. (VHDL is able to claim to have precisely defined semantics by dictating what is computed when. However this results in what might be thought of as semantics preserving transformations, such as splitting a signal in two, not being semantics preserving. Again this is only an issue for badly designed circuits).
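
The (1,0) to (0,1) example can be made concrete with a small sketch. It assumes, purely for illustration, that the reading simulator re-evaluates the AND after every individual write; the function name and the write-order flag are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Count rising edges seen on AND(a, b) while the writer changes
 * (a, b) from (1, 0) to (0, 1), for a given order of the two writes. */
unsigned and_pulses(bool write_a_first)
{
    bool a = true, b = false;
    bool out = a && b;
    bool prev;
    unsigned pulses = 0;
    int step;

    for (step = 0; step < 2; step++) {
        bool update_a = (step == 0) == write_a_first;

        if (update_a)
            a = false;
        else
            b = true;
        prev = out;
        out = a && b;        /* reader re-evaluates after each write */
        if (!prev && out)
            pulses++;
    }
    assert(!out);            /* both orders end at AND(0, 1) = 0 */
    return pulses;
}
```

Writing a first passes through (0,0) and produces no pulse; writing b first passes through (1,1) and produces one. The final value is the same either way, so only a circuit that counts or latches the transient can tell the two orders apart.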

[1391] Just-In-Time I Lazy I Interleaved Simulation

[1392] Busy waiting might be worth while when one has at least as manyprocessors as simulators wishing to busy wait, and one doesn't want touse the computer for anything else at the same time, but for mostcircumstances it would be unsuitable.

[1393] The simulation-in-turns approach while simple and general couldresult in much more work being done than required. FIG. 44C illustratesan example of a simulator reengagement 4470, in accordance with oneembodiment of the present invention.

[1394] These two blocks represent hardware simulated by two connectedcycle-based asynchronous simulators 4472. The dashed lines representasynchronous logic, although at the cosimulation level one may not knowwhere the asynchronous logic is. If one uses a simulation-in-turnsscheduling policy then one updates all outputs from simulator 1 and thenupdate all outputs from simulator 2. If it is assumed that eachsimulator reads and writes their inputs and output in the order A, B, C,D, E, then the input B to simulator 1 may change after both simulatorshave simulated one cycle, so another simulation cycle of simulator 1 isperformed, which triggers another simulation cycle in simulator 2 and soon. In all each simulator has to repeat the same simulation cycle threetimes.

[1395] In the example above it seems obvious that each simulator needonly simulate each cycle once, one just need to use a finer level ofsynchronization. However it's not always the case that each simulationcycle need only be performed once. If the inputs and outputs of theasynchronous logic was fed to a device which was being clockeddifferently then it may be necessary to repeat a simulation cycle.Instead of repeating a simulation cycle every time an input changes, onecan delay calculating outputs until the output is required. This enablesone to ignore changes to the inputs if no one is going to read theoutputs. This is safe as long as the asynchronous logic is non-cyclicand is thus unable to form latches or registers, if registers existed inthe asynchronous logic then the logic could count the number of times aninput changed, however this falls within the realms of badly designedhardware.

[1396] In the course of simulating one cycle, an input could change zero times, once or many times. There's little point waiting for any input change before allowing a simulator to advance; a better scheme would be to wait until an output is required before advancing simulation. Simulation output is required whenever time advances or another simulator wishes to read the simulator's output. When an output is required and new inputs have arrived since the last time that output occurred, the simulation is allowed to proceed to the point where that output is produced.
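The demand-driven scheme described above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the backplane's implementation: the LazySim type, its function names and the toy model (output equals input plus one) are all introduced here for illustration only.

```c
#include <assert.h>

/* Hypothetical model of one simulator whose output is computed on demand. */
typedef struct {
    int input;        /* most recent input value written by another simulator */
    int output;       /* last output value produced */
    int stale;        /* nonzero if input has changed since output was produced */
    int evaluations;  /* how many times the simulation cycle was (re)run */
} LazySim;

void lazy_init(LazySim *s)
{
    assert(s != 0);
    s->input = 0;
    s->output = 0;
    s->stale = 1;       /* nothing has been computed yet */
    s->evaluations = 0;
}

/* Writing an input only records it; no simulation work is done here. */
void lazy_write_input(LazySim *s, int value)
{
    s->input = value;
    s->stale = 1;
}

/* Reading the output advances the simulation only if new input has arrived. */
int lazy_read_output(LazySim *s)
{
    if (s->stale) {
        s->output = s->input + 1;  /* stand-in for simulating one cycle */
        s->evaluations++;
        s->stale = 0;
    }
    return s->output;
}
```

With this arrangement an input may change any number of times between reads, yet the cycle is simulated at most once per read of the output.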

[1397] In the above example evaluation proceeds in the following order. The clock changes, which invalidates outputs from the simulators; logic between the clock and all outputs is assumed (if there were no such logic, that is, if the outputs were purely dependent on inputs and not registers, then evaluation would proceed in the same order but for slightly different reasons). Sim1 advances past outputting A and blocks on reading B. There's no point in delaying outputting A as it may be the same however long one waits, but it may be worthwhile delaying reading B to avoid reading in a value which is going to change. Sim2 blocks on reading A until sim1 attempts to read B (if sim1 has already reached this point then sim2 doesn't block). Once sim1 is blocked on reading B, and sim2 is blocked on reading A, sim2 is allowed to proceed until it tries to read C. The key here is that simulators may be suspended while trying to read input until the input is up to date. An input is out of date if it was produced by a simulator that has received new input more recently than it produced the output. If only one simulator is trying to read up-to-date input, that simulator proceeds; if more than one simulator is trying to read up-to-date input, then one could pick one or both to proceed. If all simulators are trying to read out-of-date input, there may be some asynchronous cyclic logic, and one may pick one simulator to proceed. Some asynchronous cyclic logic can be used in a well-defined manner where race conditions don't apply; if it is, then which simulator goes first doesn't matter. Otherwise one has another case of badly designed hardware, and the output in practice as well as in simulation would be unpredictable.
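The up-to-date rule and the choice of which blocked simulator may proceed can be sketched as below. The BlockedSim record, the use of logical time counters and the tie-breaking choice of simulator 0 in the cyclic case are assumptions made for this sketch; the text leaves the choice open.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for one blocked simulator. Times are logical
   counters maintained by the backplane, not wall-clock times. */
typedef struct {
    unsigned lastInputTime;   /* when this simulator last received new input */
    unsigned lastOutputTime;  /* when this simulator last produced its output */
    size_t   waitingOn;       /* index of the simulator whose output it wants */
} BlockedSim;

/* An output is out of date if its producer has received new input more
   recently than it produced that output. */
int output_is_up_to_date(const BlockedSim *producer)
{
    return producer->lastOutputTime >= producer->lastInputTime;
}

/* Choose which blocked simulator may proceed. Prefers a simulator whose
   requested input is up to date; if none is (possible asynchronous cyclic
   logic), falls back to simulator 0, an arbitrary choice for this sketch. */
size_t pick_next(const BlockedSim *sims, size_t count)
{
    size_t i;
    assert(sims != NULL && count > 0);
    for (i = 0; i < count; i++) {
        assert(sims[i].waitingOn < count);
        if (output_is_up_to_date(&sims[sims[i].waitingOn]))
            return i;
    }
    return 0;
}
```

For the situation described above, where sim1 has already output A and waits on B while sim2 waits on A, the function selects sim2, matching the order given in the text.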

[1398] So far we've assumed that within one cycle, all outputs are dependent on all inputs. Assuming the outputs depend on all inputs may be overly cautious, and may force more simulation cycles to be repeated than necessary. If the cosimulation API were able to capture details of such dependencies then the need to repeat simulation cycles could be more accurately calculated.

[1399] Cosimulation Programming Interface

[1400] The information required by a cosimulation backplane to correctly schedule simulators includes:

[1401] Type of simulator: event based, cycle-based synchronous, cycle-based asynchronous.

[1402] Dependencies between inputs and outputs in models (optional)

[1403] The optional items may help more accurately calculate when simulation cycles need to be repeated, but an approximation can be used if the optional info is unavailable.
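One way to carry the information listed above is a small descriptor per simulator. The following is a sketch only: the SimulatorInfo layout, the enum names and the dependency-table encoding are assumptions, not part of the described API. When the optional dependency table is absent, the function falls back to the approximation that every output depends on every input.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    SIM_EVENT_BASED,
    SIM_CYCLE_SYNCHRONOUS,
    SIM_CYCLE_ASYNCHRONOUS
} SimulatorType;

/* Hypothetical descriptor a simulator plugin might hand to the backplane. */
typedef struct {
    SimulatorType type;
    size_t numInputs;
    size_t numOutputs;
    /* Optional: row-major numOutputs x numInputs table, nonzero where an
       output depends on an input within one cycle. NULL if unknown. */
    const unsigned char *depends;
} SimulatorInfo;

/* Does output `out` have to be recomputed after input `in` changes? */
int output_affected(const SimulatorInfo *info, size_t out, size_t in)
{
    assert(info != NULL && out < info->numOutputs && in < info->numInputs);
    if (info->depends == NULL)
        return 1;  /* approximation: every output depends on every input */
    return info->depends[out * info->numInputs + in] != 0;
}
```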

[1404] It's also necessary for the cosimulation backplane to know what hardware interfaces are being modeled by a simulator. For a hardware simulator the hardware interfaces being used could be almost anything; even for an instruction set simulator there is some configurability, such as bus widths and interrupt interfacing methods. There are two ways in which this information could be used by a cosimulation backplane: statically or dynamically. The implication of this is that when writing code used by a cosimulation backplane to indicate how the simulated models are connected together, one could have details of the models' hardware interfaces checked either at compile-time or at run-time.

[1405] Compile-time checking would require automatic generation of C/C++ header files from various simulator plugins. This scheme has the benefit that coding mistakes resulting in hardware interface mismatches are spotted earlier. It wouldn't, however, result in faster simulation, since it may still be necessary to check that the actual hardware interfaces used by a simulator are the same as the ones expected by the cosimulation backplane. A static hardware interface connecting approach may result in syntactically nicer code, as actual C/C++ identifiers and struct names could be used and not just names in strings to be connected up later.

[1406] Using a dynamic approach to hardware interface connections would remove the need for automatic C/C++ header file generation; all interface names would be stored in strings and checked for validity later. A dynamic approach would also be more suitable if the cosimulation backplane is to be configured using a GUI and not a C/C++ program. The whole issue of how one starts up different simulators is likely to be a matter of personal taste; it's probably best that the cosimulation API doesn't prohibit any mechanism, either by supporting a number of startup techniques or by being neutral on the issue.
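A run-time check of the dynamic kind might look like the sketch below, in which interfaces are named by strings and validated when the connection is made. The PortDesc type, the status codes and the port names used in the usage note are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one hardware interface exposed by a model. */
typedef struct {
    const char *name;   /* interface name, looked up at run time */
    unsigned    width;  /* width in bits */
} PortDesc;

typedef enum {
    CONNECT_OK,
    CONNECT_NO_SUCH_PORT,
    CONNECT_WIDTH_MISMATCH
} ConnectStatus;

static const PortDesc *find_port(const PortDesc *ports, size_t count,
                                 const char *name)
{
    size_t i;
    for (i = 0; i < count; i++)
        if (strcmp(ports[i].name, name) == 0)
            return &ports[i];
    return NULL;
}

/* Check, at run time, that a named port on model A can be wired to a
   named port on model B. */
ConnectStatus check_connection(const PortDesc *a, size_t na, const char *nameA,
                               const PortDesc *b, size_t nb, const char *nameB)
{
    const PortDesc *pa, *pb;
    assert(a != NULL && b != NULL && nameA != NULL && nameB != NULL);
    pa = find_port(a, na, nameA);
    pb = find_port(b, nb, nameB);
    if (pa == NULL || pb == NULL)
        return CONNECT_NO_SUCH_PORT;
    if (pa->width != pb->width)
        return CONNECT_WIDTH_MISMATCH;
    return CONNECT_OK;
}
```

A misspelled name or a 16-bit bus wired to a 32-bit bus is reported only when check_connection runs, which is the trade-off against the compile-time approach.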

[1407] Cosimulation User Documentation

[1408] The present section explains how to use the cosimulation server program, and how to use the client library.

[1409] Cosimulation Architecture

[1410] FIG. 44D illustrates a schematic of an exemplary cosimulation architecture 4480. Cosimulation is split into two parts: a client 4482 and a server 4484. The server co-ordinates the allocation of synchronization points (or sync-points) and shared memory. The clients are the simulators one may want to use in cosimulation, with plugins using the cosimulation client library. To start cosimulating, first the cosimulation server may be started, then clients may start and finish, and allocate and deallocate cosimulation resources, asynchronously with respect to each other. Typically a cosimulation client may first make a connection to the cosimulation server, then it may register any sync-points it wishes to use to synchronize with other simulators, and attach any shared memory it wishes to use to share data with other simulators. The simulators may then communicate via the shared memory and synchronize using the sync-points before detaching from the server.

[1411] Data Types

[1412] The following data types are used in the cosimulation client library:

[1413] typedef void CosimConnection;

[1414] typedef void SyncPoint;

[1415] typedef void (*CosimErrorHandler) (char* error);

[1416] CosimConnection and SyncPoint are actually structs, but the user of the cosimulation client library may only be dealing with pointers to them. CosimErrorHandler is used to register an optional error handler.

[1417] Connections

[1418] CosimConnection* CosimConnect(char* servername, CosimErrorHandler errorHandler);

[1419] This function establishes a connection from the client to the server.

[1420] servername

[1421] Specifies the name of the server; if null is passed, “CosimServer” is used.

[1422] errorHandler

[1423] Specifies a function the client library functions should call when an error occurs. The error handling function is passed a text string explaining the error. When the error handling function returns, the cosimulation library may terminate the process. If a null value is given, a default error handling function is called which pops up a message box explaining the error.

[1424] return

[1425] Returns a pointer to the opaque CosimConnection structure.
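As an illustration of the errorHandler parameter, the sketch below defines a handler matching the CosimErrorHandler type. The handler name, the message buffer and its size are assumptions made for this example; only the typedef comes from the data types listed above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Type taken from the data types of the client library. */
typedef void (*CosimErrorHandler)(char* error);

/* Hypothetical buffer in which the handler keeps the most recent message. */
static char lastCosimError[256];

/* Logs the error to stderr and keeps a copy. The library may terminate the
   process once this function returns, so any cleanup belongs here. */
void LogCosimError(char* error)
{
    assert(error != NULL);
    fprintf(stderr, "cosimulation error: %s\n", error);
    strncpy(lastCosimError, error, sizeof lastCosimError - 1);
    lastCosimError[sizeof lastCosimError - 1] = '\0';
}
```

Such a handler would be passed as the second argument, for example CosimConnect(NULL, LogCosimError), replacing the default message box.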

[1426] void CosimDisconnect(CosimConnection* connection);

[1427] This function closes a connection from the client to the server. Any cosimulation resources (e.g. sync-points and shared memory) that have been allocated but not explicitly deallocated may be automatically deallocated when the client disconnects from the server. [The server may automatically clean up if a client terminates without disconnecting first; this prevents one crashed simulator bringing the remaining simulators to a standstill.]

[1428] connection

[1429] The pointer returned by CosimConnect

[1430] Synchronization Points

[1431] Sync-points enable a number of simulators to synchronize with each other at various points. When a number of simulators all wish to synchronize at a certain point, the desired effect is that none of the simulators proceed past that point until all the simulators concerned have reached that point. Not all simulators have to synchronize at once; one can have only a subset of the simulators synchronizing. For a simulator to synchronize it may first register interest in a sync-point. When synchronization on that sync-point is desired, all the simulators which registered the sync-point may call CosimSync with that sync-point; only when they have all called this function may the function return. During registration sync-points are identified by integers; these integers would typically be defined by an enum in a common header file. [If the cosimulation becomes deadlocked, for example by two interdependent simulators blocking on different sync-points, the cosimulation server may report a deadlock; this indicates a bug in the use of the cosimulation client library.]

[1432] SyncPoint* CosimRegisterSyncPoint(CosimConnection* connection, int syncPointId);

[1433] This function registers a simulator's interest in a particular sync-point.

[1434] connection The pointer returned by CosimConnect

[1435] syncPointId For two simulators to synchronize at some point they may both register

[1436] SyncPoints with the same numeric id; these ids would typically be defined by an enum in a shared header file.

[1437] return Returns a SyncPoint pointer. This pointer is used in calls to CosimSync.

[1438] void CosimUnregisterSyncPoint(CosimConnection* connection, SyncPoint* syncPoint);

[1439] This function is used by a simulator to unregister sync-points. Unregistering sync-points is handled automatically when CosimDisconnect is called [and also when a simulator crashes], so calls to this function are not typically needed.

[1440] connection The pointer returned by CosimConnect

[1441] syncPoint The pointer returned by CosimRegisterSyncPoint

[1442] void CosimSync(SyncPoint* syncPoint);

[1443] This function is called by a simulator when it wishes to synchronize with all the other simulators which registered this sync-point. Until all the simulators which have registered a particular sync-point call this function with that sync-point, none of the calls may return.

[1444] syncPoint The SyncPoint pointer returned by CosimRegisterSyncPoint
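The barrier behaviour of a sync-point can be modelled with two counters, as in the following sketch. It is a single-threaded model of the bookkeeping only, not the server's implementation; the SyncPointModel type, the function names and the enum identifiers are assumptions introduced for illustration.

```c
#include <assert.h>

/* Sync-point ids as they might appear in a common header file
   (names are illustrative). */
enum { SYNC_CLOCK_EDGE = 1, SYNC_BUS_TRANSFER = 2 };

/* Hypothetical server-side record for one sync-point. */
typedef struct {
    int id;
    unsigned registered;  /* simulators that registered interest */
    unsigned arrived;     /* simulators currently blocked in CosimSync */
} SyncPointModel;

void sync_register(SyncPointModel *sp)
{
    assert(sp != 0);
    sp->registered++;
}

/* Records one simulator calling CosimSync. Returns 1 when this arrival
   completes the set, at which point all waiting calls may return and the
   sync-point is reset for reuse; returns 0 while callers must keep waiting. */
int sync_arrive(SyncPointModel *sp)
{
    assert(sp != 0 && sp->arrived < sp->registered);
    sp->arrived++;
    if (sp->arrived == sp->registered) {
        sp->arrived = 0;
        return 1;
    }
    return 0;
}
```

Only simulators that registered the sync-point are counted, so a subset of the simulators can synchronize without involving the rest.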

[1445] Shared Memory

[1446] Functions are provided to assist in sharing memory between simulators. Simulators may attach and detach shared memory. When attaching memory the memory is identified by an integer. This integer would typically be defined by an enum in a common header file. When different simulators attach to memory using the same memory identifier integer, they gain access to the same shared memory. The cosimulation server issues a warning if the same memory is requested but with different sizes. Typically detaching is unnecessary as all resources are deallocated automatically when a simulator disconnects from the cosimulation server [and when any simulators crash].

[1447] So long as at least one simulator has a given piece of memory attached, that memory is available to be shared by other simulators. When no simulators have a given piece of memory attached that memory is lost, and new requests for memory by the same memory identifier integer may result in new memory being allocated, possibly with a different size.

[1448] void* CosimAttachMemory(CosimConnection* connection, unsigned memId, unsigned size);

[1449] This function attaches a simulator to shared memory identified by the integer memId.
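The lifetime rule for shared memory amounts to reference counting, which the following sketch models within a single process. The SharedRegion type and function names are assumptions, and calloc stands in for whatever shared memory mechanism the server actually uses.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical server-side record for one shared memory region. */
typedef struct {
    unsigned id;
    unsigned size;
    unsigned attached;  /* number of simulators currently attached */
    void    *block;     /* NULL while no simulator is attached */
    unsigned warnings;  /* size-mismatch warnings issued so far */
} SharedRegion;

/* Attach one simulator. The first attach allocates; later attaches share the
   existing block and trigger a warning if they ask for a different size. */
void *region_attach(SharedRegion *r, unsigned size)
{
    assert(r != NULL && size > 0);
    if (r->attached == 0) {
        r->block = calloc(1, size);
        if (r->block == NULL)
            return NULL;
        r->size = size;
    } else if (size != r->size) {
        r->warnings++;
        fprintf(stderr, "warning: memory %u requested with size %u, "
                "existing size is %u\n", r->id, size, r->size);
    }
    r->attached++;
    return r->block;
}

/* Detach one simulator. When the last one detaches the memory is lost. */
void region_detach(SharedRegion *r)
{
    assert(r != NULL && r->attached > 0);
    if (--r->attached == 0) {
        free(r->block);
        r->block = NULL;
        r->size = 0;
    }
}
```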

[1450] connection The pointer returned by CosimConnect

[1451] memId An integer used to identify a piece of shared memory

[1452] size The desired size of the shared memory

[1453] return A pointer to the shared memory

[1454] void CosimDetachMemory(CosimConnection* connection, void* memPtr);

[1455] This function detaches a piece of shared memory from a simulator. Calling it is typically unnecessary as shared memory is automatically detached when CosimDisconnect is called.

[1456] connection The pointer returned by CosimConnect

[1457] memPtr The pointer returned by CosimAttachMemory
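The typical life cycle of a client (connect, register, attach, communicate, disconnect) can be sketched as follows. The declarations repeat those given in this section; the stub bodies are stand-ins so that the sketch compiles without the real client library, they serve a single client only, and they are not the library's implementation. The identifiers SYNC_DATA_READY, MEM_MAILBOX and PublishValue are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Declarations as given in this section. */
typedef void CosimConnection;
typedef void SyncPoint;
typedef void (*CosimErrorHandler)(char* error);

/* Stand-in stubs (assumption): single client, no real server involved. */
static int stubConnection, stubSyncPoint;
static int stubMemory[16];
static int stubSyncCalls;

CosimConnection* CosimConnect(char* servername, CosimErrorHandler errorHandler)
{ (void)servername; (void)errorHandler; return &stubConnection; }
void CosimDisconnect(CosimConnection* connection) { (void)connection; }
SyncPoint* CosimRegisterSyncPoint(CosimConnection* connection, int syncPointId)
{ (void)connection; (void)syncPointId; return &stubSyncPoint; }
void CosimSync(SyncPoint* syncPoint) { (void)syncPoint; stubSyncCalls++; }
void* CosimAttachMemory(CosimConnection* connection, unsigned memId, unsigned size)
{ (void)connection; (void)memId; return size <= sizeof stubMemory ? stubMemory : NULL; }

/* Identifiers that would normally live in a header shared by all clients. */
enum { SYNC_DATA_READY = 1 };
enum { MEM_MAILBOX = 1 };

/* One client's life cycle. Returns the value read back from shared memory. */
int PublishValue(int value)
{
    /* NULL, NULL selects the default server name and error handler. */
    CosimConnection* conn = CosimConnect(NULL, NULL);
    SyncPoint* ready;
    int* mailbox;
    int seen;

    assert(conn != NULL);
    ready = CosimRegisterSyncPoint(conn, SYNC_DATA_READY);
    mailbox = (int*)CosimAttachMemory(conn, MEM_MAILBOX, sizeof(int));
    assert(ready != NULL && mailbox != NULL);

    mailbox[0] = value;   /* share data through the attached memory */
    CosimSync(ready);     /* returns once every registered simulator calls it */
    seen = mailbox[0];

    CosimDisconnect(conn);  /* also releases the sync-point and shared memory */
    return seen;
}
```

Against the real library a second client would attach MEM_MAILBOX, register SYNC_DATA_READY and read the mailbox after its own CosimSync call returns.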

[1458] Cosimulation Server

[1459] The cosimulation server is a command line program which takes one optional argument, the name of the cosimulation server. This name defaults to “CosimServer”. By specifying a different name, multiple instances of the same cosimulation environment can be run at the same time without interfering with each other. A maximum of 63 clients may connect to one cosimulation server. The cosimulation server may warn if simulators try to attach the same piece of shared memory but specify different sizes for that shared memory.

[1460] Multithreading

[1461] The CosimConnection pointer may be passed between threads within the process that called CosimConnect but not between processes. It is not safe in general to use the same cosimulation connection in two calls of cosimulation client library functions at the same time; multiple connections from the same process may be established instead.

[1462] SingleStep/Handel-C Integration Possibilities

[1463] Using the SingleStep MMK interface it's possible to have Handel-C model a memory mapped device, raise interrupts, operate in a DMA fashion, and act as a coprocessor communicating via special processor registers. It's also possible to override any SingleStep implementation of MMUs, Caches and Bus Interface Units.

[1464] Cosimulating by keeping two simulators running in lock-step provides a clock cycle accurate simulation of a CPU and FPGA. Also, it enables unusual things like non-invasive profiling of the CPU to see which instructions and memory are most heavily used. A custom-made memory management unit may also be enabled.

[1465] Improvements that Would Make Simulators More Amenable to Cosimulation

[1466] As an option, it may be one object of the present invention to provide an interface which would enable information about and control of a simulator's debugging interface. Before starting a cycle, a debugger should check with all other debuggers that they do not wish to break on this cycle. If one breaks, they all break.

[1467] When the user instructs one simulator to resume execution, all should resume execution. This is slightly more complex to achieve. A simple approach would be to have each debugger, when suspended, poll a plugin function every 0.1 seconds to see if execution should be resumed. The figure, 0.1 seconds, is a compromise between user interface latency and wasting CPU cycles. A more responsive, less wasteful but more complex solution would be to have each debugger support some form of asynchronous interaction with its plugins. Such asynchronous communication could be achieved by having plugins spawn a new thread which is permitted to send a Windows user message to the debugger, indicating that some plugin function should be called. Thus messages from the plugin can be received in the same queue as GUI messages. It is probably also possible to have the debugger simultaneously wait for Windows messages in a queue and wait for an event to become signaled.
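The simple polling approach can be sketched as a loop around a plugin query. Everything named here is hypothetical: the ResumeQuery and SleepFn callback types, the poll limit, and the two example callbacks, which exist only so the loop can be exercised without a real debugger or a platform sleep call.

```c
#include <assert.h>

/* Hypothetical plugin query: returns nonzero once execution should resume. */
typedef int (*ResumeQuery)(void *context);
/* Hypothetical sleep hook, e.g. a wrapper around the platform's sleep call. */
typedef void (*SleepFn)(unsigned milliseconds);

enum { POLL_INTERVAL_MS = 100 };  /* the 0.1 second compromise */

/* Poll the plugin until it reports that execution should resume, or until
   maxPolls attempts have been made. Returns the number of polls performed,
   or -1 if the limit was reached without a resume request. */
int wait_for_resume(ResumeQuery shouldResume, void *context,
                    SleepFn sleepMs, int maxPolls)
{
    int polls;
    assert(shouldResume != 0 && sleepMs != 0);
    for (polls = 1; polls <= maxPolls; polls++) {
        if (shouldResume(context))
            return polls;
        sleepMs(POLL_INTERVAL_MS);
    }
    return -1;
}

/* Example query: context points to a countdown; resume when it reaches zero. */
int countdown_query(void *context)
{
    int *remaining = (int *)context;
    if (*remaining == 0)
        return 1;
    (*remaining)--;
    return 0;
}

/* Example sleep hook that only accumulates the time it was asked to sleep. */
static unsigned sleptMs;
void fake_sleep(unsigned ms) { sleptMs += ms; }
```

The latency cost of the scheme is visible directly: a resume request is noticed at most one polling interval after it is made.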

[1468] As an option, it may also be an object of the present invention to decouple a debugger's GUI from its simulator back-end. This would make further architectural changes easier, such as placing multiple simulators in the same process and using inter-process communication to communicate between each simulator and its respective GUI. For simulators with a high degree of communication relative to computation, placing the simulators in the same process would be helpful. A suitable decoupling of GUIs from simulators would make a variety of cosimulation arrangements possible. Another possible arrangement, suitable for multi-processor machines, would be to use busy waiting to synchronize simulators. So long as one has as many processors as one does simulators, execution may proceed faster than when using OS-based synchronization primitives. The point here really is that many cosimulation architectural arrangements are possible. A suitable decoupling of GUI from simulator may prevent being tied to any one arrangement and enable a variety to be used as future environments require.

[1469] As an option, it may be still another object of the present invention to make a simulator suitable for a wide variety of cosimulation architectures. The simulator should in effect be turned inside-out. That is, the simulator should provide a number of functions which each return as quickly as possible; these functions are then called by a host program which may be responsible for ensuring a suitable order of evaluation. It would be the host program that would be responsible for integrating two simulators in one thread. The host program may also use multiple threads and communicate either via OS-based synchronization primitives or busy waiting. The simulators would have no awareness of the environment in which they are executing, and thus could be used in a variety of environments. If a simulator wishes to wait for something such as user interaction or a network connection, it should do so in a non-blocking fashion by telling the host program (via a return value or call-back). This enables multiple simulators to be waiting at the same time.
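An inside-out simulator of this kind might expose a single step function that the host calls in turn, as sketched below. The StepResult codes, the HostedSim record, the round-robin policy and the toy countdown simulator are assumptions chosen for illustration; a real host could order the calls differently.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { STEP_ADVANCED, STEP_WAITING, STEP_FINISHED } StepResult;

/* Hypothetical interface an "inside-out" simulator exposes to its host. */
typedef struct {
    StepResult (*step)(void *state);  /* does a little work, then returns */
    void *state;
} HostedSim;

/* Host loop: calls every simulator in turn until all have finished.
   Returns the total number of step calls, or -1 if maxRounds rounds pass
   without all simulators finishing. */
int host_run(HostedSim *sims, size_t count, int maxRounds)
{
    int round, calls = 0;
    assert(sims != NULL);
    for (round = 0; round < maxRounds; round++) {
        size_t i, finished = 0;
        for (i = 0; i < count; i++) {
            StepResult r = sims[i].step(sims[i].state);
            calls++;
            if (r == STEP_FINISHED)
                finished++;
            /* STEP_WAITING: the simulator is waiting on something external;
               the host moves on to the next one instead of blocking. */
        }
        if (finished == count)
            return calls;
    }
    return -1;
}

/* Toy simulator: state is the number of cycles left to simulate. */
StepResult countdown_step(void *state)
{
    int *cyclesLeft = (int *)state;
    if (*cyclesLeft == 0)
        return STEP_FINISHED;
    (*cyclesLeft)--;
    return STEP_ADVANCED;
}
```

Because a waiting simulator reports its state through the return value rather than blocking, several simulators can be waiting at once while the host keeps the others running in the same thread.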

[1470] Generic Cosimulation Architecture

[1471] As a further option, it may be an object of the present invention to link a number of simulators together in a generic way. Such an ability could be provided via a programming interface and/or a GUI. A number of issues are involved here. Two important issues include identifying what hardware a simulator is simulating, and ensuring execution proceeds in a suitable order. Most simulators are capable of simulating a number of hardware components. Even something as specialized as a microprocessor simulator can model processors which have different bus widths or a different number of interrupt lines. This means one may not know until the simulators are running that they are compatible with each other. For example, one may assume a bus is 16 bits wide and another may assume the bus is 32 bits wide. Having a means to automatically determine the external characteristics (such as bus widths) of a hardware model is desirable. This may or may not require the execution of the simulator.

[1472] It should be noted that enabling a plugin to load up a different program in a processor/net list in an FPGA could also be useful.

[1473] Cosimulation via SystemC

[1474] It may be possible to cosimulate using SystemC for joining up simulators. However, there are a couple of issues which can't be answered without looking further into the implementation of SystemC. A first issue involves the time at which SystemC verifies that the components being plugged together are compatible. If this is at compile-time, new C++ code may have to be generated for different hardware models, e.g. processors with different bus-widths. Also, one doesn't have much control over the order in which SystemC evaluates things. It may be desirable to modify SystemC to improve this matter. Unfortunately, SystemC licensing restrictions prohibit the distribution of modified SystemC code other than back to the SystemC committee.

[1475] Hardware emulators support

[1476] It might be worth looking into co-emulation. That is, it may be useful to consider using a processor and an FPGA on PCI boards. This may be accomplished using either the same or different boards. Both SingleStep and ARM Developer Studio provide a similar environment for monitoring the state of emulation hardware as they do for monitoring simulation software.

[1477] Clearly, the majority of the work would be on Handel-C. If the CPU and FPGA are communicating over the PCI bus instead of being on the same board or communicating via a dedicated link between the two boards, it may be even more beneficial to enable either the FPGA or the CPU to run ahead of the other. While it may be possible to implement a MMU in an FPGA, one doesn't want to restrict simulation/emulation speed to lock-step speed just in case it might be needed. A hybrid simulation/emulation environment would also be a possibility.

[1478] Specific Improvements to SingleStep

[1479] It may be another optional object to provide better support for console interaction, and have a debugger jump to the front when a breakpoint is reached. This would provide better interaction when a memory access keeps returning MMKR-NOT-READYiO.

[1480] It may also be desirable to provide a call-back or mmk-access return value which enables an MMK plugin to indicate the debugger should behave as though it has reached a break point. The MMK plugin should be able to instruct the debugger to break on any clock cycle, even those in the middle of a multi-clock cycle instruction.

[1481] One may also wish to make the GUI respond to a custom Windows user message which tells it to call a MMK plugin function. This function may then tell the debugger to advance one clock, or advance over the next assembler instruction/C line of code.

[1482] Plugins may be equipped with the ability to change the symbol table used by SingleStep. In settings where an FPGA is going to change the program code of a processor, it may be useful to have the C source code reflect that change.

[1483] SingleStep has support for Multi-Tasking Debugging (MTD). This gives SingleStep awareness of an operating system. An OS specific library may be used. There also exists an MTD Library Kit which enables one to provide support for any OS. The SingleStep for MCore Targets manual gives partial documentation of the MTD Library Kit. It provides a call back function which enables a library to call any command-line command. By issuing commands such as step and go, it may be possible to gain some of the capabilities called for above.

[1484] Integrating the Handel-C Simulator with SingleStep

[1485] SingleStep provides two different API's for interfacing external logic simulators to the CPU processors, the Peripheral API and the Memory Modeling Kit.

[1486] Memory Modeling Kit

[1487] The MMK API is an optional extra (presumably costing more). It enables the user to replace the entire memory with their own implementation. The interface is clock cycle accurate. Further, the SingleStep debugger calls a MMK library function such as mmk-access to access memory. The MMK library function returns a value indicating how many clock cycles the call took, and how successful it was.

[1488] For slow memory, the mmk-access function may either return the total number of clock cycles required, or it can return a smaller number of clock cycles, such as 1. Further, it may indicate that the memory access isn't over. SingleStep may then call mmk-access repeatedly until the memory access completes. The mmk-access function may also return indicating that no clock cycles have passed. This can be used to allow the SingleStep debugger to respond to user interaction and update windows. The MMK library may model just the memory logic external to the CPU. In the alternative, it may also include some of the MMU, the Cache and the Bus Interface Unit if the user wishes to use their own implementation instead of the SingleStep implementation or if SingleStep doesn't provide an implementation of these for a particular processor.

[1489] The MMK API provides call back functions enabling the MMK library to create windows and add menu items. The MMK Library may update data structures and use call-backs to inform SingleStep when the interrupt status has changed. This enables clock-cycle accurate simulation of interrupts.
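The repeated-call protocol for slow memory can be illustrated with the sketch below. The actual signature and return codes of mmk-access are not given in this text, so the sketch does not attempt to reproduce them: AccessResult, SlowMemory, slow_access and read_with_timing are hypothetical names that model only the behaviour described, namely returning one clock cycle per call until the access completes.

```c
#include <assert.h>

/* Hypothetical result of one call into a memory model: how many clock
   cycles elapsed during the call and whether the access has completed.
   These names are illustrative and are not the MMK's actual types. */
typedef struct {
    unsigned cycles;
    int      complete;
} AccessResult;

/* Hypothetical slow memory that needs `latency` cycles per access. */
typedef struct {
    unsigned latency;
    unsigned elapsed;  /* cycles already spent on the access in progress */
    int      value;
} SlowMemory;

/* Advance the access in progress by one clock cycle per call. */
AccessResult slow_access(SlowMemory *mem, int *out)
{
    AccessResult r = { 1, 0 };
    assert(mem != 0 && out != 0);
    mem->elapsed++;
    if (mem->elapsed >= mem->latency) {
        *out = mem->value;
        mem->elapsed = 0;
        r.complete = 1;
    }
    return r;
}

/* Caller side: keep calling until the access completes, adding up the
   cycles reported by each call. */
unsigned read_with_timing(SlowMemory *mem, int *out)
{
    unsigned total = 0;
    AccessResult r;
    do {
        r = slow_access(mem, out);
        total += r.cycles;
    } while (!r.complete);
    return total;
}
```

Returning after each cycle, rather than once with the total, gives the caller a chance to service user interaction or other simulators between calls.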

[1490] Peripheral API

[1491] The Peripheral API enables the user to integrate their external logic with SingleStep's own memory implementation. SingleStep's own memory implementation allows one to specify which types of memory are where in the memory map. Also included are details such as RAM/ROM/WOM, access speed, burst read capabilities, whether the memory should be cached, and what mode the processor should be in to be able to access the data: user/supervisor, instruction fetch/data fetch.

[1492] The Peripheral API functions are only called between instructions. The functions are told which clock cycles they have been called on, so clock cycle accurate synchronization is still possible. Not being able to stop SingleStep on arbitrary clock cycles would limit the user's interaction with the debugger. It seems unlikely that the SingleStep GUI may be responsive while a peripheral library is blocking (given that the MMK API provides an explicit means to allow the GUI to be responsive during long peripheral simulation and the Peripheral API doesn't).

[1493] The Peripheral API provides call back functions, enabling the Peripheral library to create windows. The Peripheral Library may update data structures to inform SingleStep that the status of interrupts has changed. The library cannot indicate when the change occurred, so simulation of mid-instruction interrupts may not be possible.

[1494] General Comparison

[1495] The Peripheral API is implemented using some C++ class interface apparently inspired by COM. It may be quite easy to use once a dummy library has been implemented. The ease with which the Peripheral API allows one to combine SingleStep implemented memory with peripherals makes the Peripheral API good for prototyping and experimenting with architectures before one is committed to one. The MMK interface would require either changing the library code each time the memory model changed, or essentially a reimplementation of SingleStep's own memory model within a MMK library.

[1496] The MMK API is quite straightforward to use. It provides both generic memory access routines and specialized ones. The user may implement at least one generic access function and any of the specialized ones they wish. SingleStep may automatically decide whether to use a generic or specialized access function. There are limits on where memory can sensibly be implemented. The SingleStep debugger needs to be able to access the contents of memory to update windows/disassemble machine code. If the implementation of the memory isn't aware of the debugger's needs, the debugger may not be able to browse memory contents. This makes implementing memory in another simulator undesirable.

[1497] Command Line Compiler

[1498] Environment Variables

[1499] The Handel-C compiler has three environment variables associated with it. HANDELC_SIM_COMPILE is an alternative to the -cl command line option. It is used to create the simulation file when compiling using the command line. HANDELC_LIBPATH is the search path for libraries. The value of HANDELC_CPPFLAGS is passed as command line options to the preprocessor each time the compiler is executed.

[1500] The Handel-C installation sets the HANDELC_CPPFLAGS variable to contain the -C option and to add the include directory to the search path for the preprocessor. The -C option passes source code comments through to the compiler.

[1501] One is free to change the value of HANDELC_CPPFLAGS and HANDELC_LIBPATH to whatever is required. To change the environment variables, use the facilities described in the installation instructions.

[1502] Temporarily Changing the Environment Variables

[1503] One can temporarily alter the value of the variable by typing the following at the DOS prompt (Windows 98) or the command prompt (Windows NT):

[1504] set HANDELC_CPPFLAGS=Command Line Options

[1505] For example:

[1506] set HANDELC_CPPFLAGS=-C -Iinclude -DDEBUG

[1507] Summary of Command Line Options

[1508] The present section details all the command line options of the Handel-C compiler and stand-alone simulator. FIGS. 45A and 45B summarize the options 4500 available on the compiler.

[1509] Target Options

[1510] The Handel-C compiler can target the simulator or hardware. Only one target option (-s, -vhdl or -edif) may be specified on the command line.

[1511] Target Simulator

[1512] To target the simulator, use the -s option on the compiler command line (handelc -s file.c). To enable debugging, add the -g option (handelc -s -g file.c).

[1513] Optimiser Options

[1514] The -O option enables all optimizations. For example, to compile the program prog.c with all optimizations, one could type:

[1515] handelc -s -O prog.c

[1516] Enabling all optimizations may substantially add to compilation time. If no optimizer command line options are specified then some optimizations are disabled to reduce compilation times at the expense of a few additional gates in the netlist.

[1517] Debugging Options

[1518] Various options are provided to aid with debugging Handel-C programs.

[1519] Turning On All Warnings

[1520] The -W option tells the compiler to display all warnings during compilation. By default, some less interesting warnings may be disabled and may not be displayed by the compiler.

[1521] Estimating Logic Area and Depth

[1522] The Handel-C compiler -e option gives feedback on logic usage and depth to help with optimizing Handel-C designs. The feedback consists of an HTML file for the project and an HTML file for each source file in the project. These highlight parts of the source code with colors which relate to the logic depth and usage. These estimates are provided as a guide only, since full place and route is needed to get exact logic area and timing information. Nevertheless, they provide a valuable starting point for optimization. To generate an HTML file, use the -e option. For example: handelc -e test.c

[1523] This may generate two files, test.html (summarizing the project) and test_c.html (estimating logic usage), which can be loaded into a Web browser such as Internet Explorer or Netscape. The project file links to the other HTML files of highlighted source code, and to the lines with the highest area or delays. The source code estimation is in two parts: estimates of logic area and estimates of logic delay (i.e. logic depth). The code is colored from blue (low) through yellow to red (high) to indicate area or delay. Optimization should concentrate on red areas first.

[1524] Compilation Control Options

[1525] Two options are provided to control compilation.

[1526] Pass Options to Preprocessor

[1527] The -cpp option can be used to pass options to the preprocessor. For example, to add the directory include to the search path, one could type:

[1528] handelc -s -cpp -Iinclude prog.c

[1529] -I, -D and -U can be used directly and do not have to be passed to the preprocessor with -cpp.

Example Programs

[1530] Introduction

[1531] This section details the basic example programs supplied with the Handel-C compiler and describes how to compile and simulate them.

BASIC EXAMPLES

Example 1 Simple Accumulator Example

[1532] Shows the use of file input and output in simulation.

Example 2 Pipelined Multiplier Example

[1533] Shows the use of a replicator.

Example 3 Queue Example

[1534] Shows the use of multiple mains in different files and how to take advantage of this Handel-C feature to test programs.

Example 4 Clients/Server Example

[1535] Shows the use of prialt, mpram, arrays of functions and separate clocks.

Example 5 Preprocessor Example

[1536] Builds a program which calculates Fibonacci numbers.

Example 6 Edge Detector Example

[1537] This is a series of programs showing how to port conventional C routines to Handel-C. Each of the programs is in a separate project within a single workspace.

[1538] Files Required for the Examples

[1539] The example project settings have been set up to reference the standard macro library (stdlib.lib) and its associated header file. If one moves the project or uses the files in a different project, the following project settings may be needed.

[1540] Preprocessor

[1541] Set the pathname handel-c root pathname\include in the Additional include directories pane

[1542] Linker

[1543] Add stdlib.lib to the Object/library modules pane

[1544] Set the pathname handel-c root pathname\Lib in the Additional library path.

Example 1 The Accumulator Example

[1545] This program takes a number of values from a file and calculates the sum of those values. It illustrates the basics of producing a Handel-C program and demonstrates the use of the simulator.

[1546] Compiling and Simulating the Program

[1547] Open the workspace file (HandelC\Examples\Handel-C\Example1\Example1.hw) by double-clicking on it. Handel-C may start with the Example 1 workspace open. Check that file view is the current view, and click on the + sign to the left of the chip icon to see what files are within the project. If one wishes to examine the code, double-click the file sum.c in the workspace pane. If one cannot see it, he or she can make the workspace pane larger by dragging its border, or make the space allocated to filenames larger by dragging the border of the Object button.

[1548] Build the project by selecting Build Example1 from the Build menu. Messages from the compiler may appear in the output window. They may give an approximation of the number of hardware gates required to implement the program.

[1549] One can then start the debugger and simulator by pressing F11 (to step through the program) or F5 (to run to the end). The simulator then starts immediately and reads the values from the file sum_in.dat, sums them, and writes the result to the file sum_out.dat. One can watch the accumulation progressing in the variable sum by opening a Watch window (select View>Debug Windows>Watch or press Alt 3) and typing sum in the window. The simulator may not terminate at the end of the program. To stop simulation, go to Debug>Stop Debugging. Examine the files to ensure that the output file contains the correct result. If one wishes to change the values in sum_in.dat, ensure that each value is placed on a line of its own.

Example 2 The Pipelined Multiplier Example

[1550] This program performs multiplication using a replicated parallel structure to create a pipeline.

[1551] The operands used are the initialization values of the arrays leftOps and rightOps, such that result[n] = leftOps[n] * rightOps[n].

[1552] This multiplier calculates the 16 LSBs of the result of a 16 bit by 16 bit multiply using long multiplication. The multiplier produces one result per clock cycle with a latency of 16 clock cycles. This means that although any one result takes 16 clock cycles, one gets a throughput of 1 multiply per clock cycle. Since each pipeline stage is very simple, combinatorial logic is shallow and a much higher clock rate is achieved than would be possible with a complete single cycle multiplier.

[1553] At each clock cycle, partial results pass through each stage of the multiplier in the sum array. Stage n adds on 2^n multiplied by the b operand if required. The LSB of the a operand at each stage tells the multiply stage whether to add this value or not.
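The shift-and-add pipeline described above can be modelled in software. The following Python sketch is only a behavioral illustration, not the Handel-C source of the example: the names stage, clock and run_pipeline are hypothetical, and the choice to shift a right and b left at every stage is an assumption about how "2^n multiplied by the b operand" is formed.

```python
WIDTH = 16                 # operand and result width in bits
MASK = (1 << WIDTH) - 1    # keeps only the 16 LSBs


def stage(entry):
    """One pipeline stage: add b to the partial sum if the LSB of a is set."""
    if entry is None:                      # empty pipeline slot
        return None
    a, b, partial = entry
    if a & 1:
        partial = (partial + b) & MASK
    # Shift a right and b left so the next stage sees the next bit of a
    # and the next power-of-two multiple of b.
    return (a >> 1, (b << 1) & MASK, partial)


def clock(pipeline, operands):
    """Advance every stage by one clock; return the result leaving the last stage."""
    new_state = [None] * WIDTH
    if operands is not None:
        left_op, right_op = operands
        new_state[0] = stage((left_op & MASK, right_op & MASK, 0))
    for n in range(1, WIDTH):
        new_state[n] = stage(pipeline[n - 1])
    pipeline[:] = new_state
    last = pipeline[-1]
    return None if last is None else last[2]


def run_pipeline(operand_pairs):
    """Feed one operand pair per clock, then flush, collecting the results."""
    pipeline = [None] * WIDTH
    results = []
    for operands in list(operand_pairs) + [None] * (WIDTH - 1):
        result = clock(pipeline, operands)
        if result is not None:
            results.append(result)
    return results
```

In this model a new operand pair can enter on every call to clock, and each result leaves on the 16th clock after its operands entered, which mirrors the latency and throughput figures given in the text.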

[1554] Operands are fed in on every clock cycle on signals leftOp and rightOp. Results appear 16 clock cycles later on every clock cycle on signal result.

Code Details

/*
 * Index at end of array macro
 */
#define IndexAtArrayEnd(Index, ArrayLimit) \
    select (exp2(width(Index)) == (ArrayLimit), !(Index), \
            ((Index) == (ArrayLimit)))

[1555] The IndexAtArrayEnd macro tests if the index of size ArrayLimit is at the end of an array, whatever width the index counter has been assigned by the compiler. In most cases, this is a normal comparison, but if the index overflows, the test may compare the overflow value. An example is an index of size 4. The compiler may assign the index a width of 2 bits (to store the values 0-3). When it is compared against 4, the index may hold the value 0 (as the most significant bit has been lost). In this case, the IndexAtArrayEnd macro compares against 0 instead of against 4. This implies that such a comparison cannot be made at the start of the cycle, when element zero is being processed, but only at the end of the cycle after the index has been incremented.
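The overflow case can be checked with a small Python model. This is a sketch only: index_at_array_end is a hypothetical stand-in for the macro, and the index width is passed explicitly because Python integers have no compiler-inferred width.

```python
def exp2(n):
    """2 raised to the power n, as in the macro."""
    return 1 << n


def index_at_array_end(index, index_width, array_limit):
    """Model of IndexAtArrayEnd for an index held in index_width bits."""
    index &= exp2(index_width) - 1          # the hardware register wraps
    if exp2(index_width) == array_limit:
        # The limit itself is not representable, so an incremented index
        # that reached the limit has wrapped to zero: test for zero.
        return index == 0
    return index == array_limit             # the normal comparison
```

With a 2 bit index and a limit of 4, the incremented value 4 wraps to 0 and the model reports the end of the array. The same test also returns true for an index that is genuinely 0, which is why the text restricts the comparison to the end of the cycle.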

[1556] Compiling and Simulating the Program

[1557] One can compile and simulate the program by opening the workspace in the examples\Handel-C\Example2 directory and selecting Build Example2 from the Build menu. One can then start the debugger.

Example 3 The Queue Example

[1558] The program is in three files: queue.c handles the queue function, while main.c provides I/O facilities. Definitions common to both files are given in queue.h. Both queue.c and main.c have a clock set (in this case, the same clock source is used for both functions).

[1559] The queue function code illustrates the use of parallel tasks and channel communications by implementing a simple four place queue. Each task holds one piece of data and has an input channel connected to the previous queue location and an output channel connected to the next queue location.

[1560] At each iteration, the data moves one place up the queue. The program executes an infinite loop, and one may use Stop Debugging to terminate the simulation.

[1561] Detailed Explanation

[1562] This example uses four parallel tasks each containing one word of data. At each iteration, one word is passed from one task to another in a chain. The links between the processes are entries in the links array of channels while the input and output to and from the system is handled by the main function.
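The movement of data through the four place queue can be pictured with a short Python model. This is a sketch under the assumption that every place hands its word to the next place in the same iteration; the function name step and the use of None for an empty place are illustrative only.

```python
PLACES = 4  # the example is a four place queue


def step(places, incoming):
    """One iteration: every word moves one place up the queue.

    Returns the word that leaves the last place (None while the queue fills).
    """
    outgoing = places[-1]
    places[1:] = places[:-1]    # each place takes the word of its predecessor
    places[0] = incoming        # the first place takes the new input word
    return outgoing
```

Feeding the values 1 to 6 into an empty queue yields four empty outputs followed by 1 and 2, so each word emerges four iterations after it entered.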

[1563] Communication between the two functions is handled by an array of channels. The queue only reads data and writes data on every other clock cycle. A replicated pipeline is used to implement the queue. The first and last entries in the pipeline are treated differently by using a select statement to differentiate them at compile time. To watch the queue in the debugger, start the debugger and add the queue variables (state etc.) to the watch window. If one adds an array name to the watch window, a + sign appears. Click on the + to get a list of the array elements.

[1564] Summary

[1565] This example has shown how to create parallel tasks and how to communicate between those tasks. It has also illustrated arrays of variables and arrays of channels. The example shows a project containing independent main functions which are implemented independently in hardware.

[1566] Also, the queue presented here is parameterized on the width of the input and output channels because the widths of all internal variables are undefined and inferred by the compiler.

[1567] Running the Example

[1568] Double-click on the workspace file Example3.hw in the examples/Handel-C/Example3 directory.

[1569] Compile and build it by selecting Build>Build Example3.

[1570] Step into the program within the debugger by pressing F11.

[1571] One may be asked to select a clock for the debugger to use. In this case they are both identical. Select one and click OK.

[1572] View local variables by selecting View>Debug Windows>Variables (or press Alt 4) and select the Locals tab.

[1573] The variables local to the function may be visible in the Debug window.

[1574] One can watch the values change as he or she steps through the code (repeatedly press F11).

Example 4 The Client/Server Example

[1575] The clients and server are implemented as independent pieces of hardware, communicating via channels. The server reads data from an array of channels from the clients and puts the results in a queue as they arrive. They are read from the queue by a dummy service routine. This is where the client requests could be processed by a real server routine. The server clock runs at half the speed of the client clock to allow time for complex assignments during request processing. There is a pair of identical client functions. These functions merely select valid requests from an array and send them to the server.

[1576] Code Details

[1577] The internal queue is implemented in a structure consisting of two counters (queueIn and queueOut) which are used to test how full the queue is, and an mpram containing the queued data. Use of an mpram allows the queue to be written to and read from in the same clock cycle.

typedef struct
{
    unsigned int queueIn;
    unsigned int queueOut;
    mpram
    {
        wom int DataWidth in[MaxQueue];
        rom int DataWidth out[MaxQueue];
    } values;
} Queue;
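A software analogue of this structure is a ring buffer with separate write and read counters. The Python sketch below is an illustration only: the class and method names are hypothetical, and the convention that the queue counts as full when one slot remains free is an assumption, since the text does not say how the two counters are compared.

```python
class Queue:
    """Ring buffer with a write counter (queue_in) and a read counter (queue_out)."""

    def __init__(self, max_queue=8):
        self.max_queue = max_queue
        self.queue_in = 0                 # next slot to write
        self.queue_out = 0                # next slot to read
        self.values = [0] * max_queue     # stands in for the mpram

    def count(self):
        """Number of queued entries, derived from the two counters."""
        return (self.queue_in - self.queue_out) % self.max_queue

    def is_empty(self):
        return self.queue_in == self.queue_out

    def is_full(self):
        # Assumption: one slot stays free so that full and empty differ.
        return self.count() == self.max_queue - 1

    def cycle(self, write_value=None, read=False):
        """One clock cycle: optionally write one entry and read one entry.

        Both operations see the counters as they were at the start of the
        cycle, modelling simultaneous access through the two memory ports.
        """
        read_value = None
        next_out = self.queue_out
        if read and not self.is_empty():
            read_value = self.values[self.queue_out]
            next_out = (self.queue_out + 1) % self.max_queue
        if write_value is not None and not self.is_full():
            self.values[self.queue_in] = write_value
            self.queue_in = (self.queue_in + 1) % self.max_queue
        self.queue_out = next_out
        return read_value
```

A call such as q.cycle(write_value=20, read=True) stores a new request and retrieves the oldest one in the same modelled clock cycle.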

[1578] Running the Example

[1579] Double-click on the workspace file Example4.hw in the examples/Handel-C/Example4 directory.

[1580] Compile and build it by selecting Build>Build Example4.

[1581] Step into the program within the debugger by pressing F11.

Example 5 The Microprocessor Example

[1582] In this example, Handel-C implements a simple microprocessor. This microprocessor executes a program stored in ROM to calculate members of the Fibonacci number sequence.

[1583] Compiling and Simulating the Program

[1584] Compile and link the program by opening the workspace in the examples\Handel-C\Example5 directory and then building the project. Simulate the program by starting the debugger (press F11 to single-step).

[1585] Detailed Explanation

[1586] The system described in this example consists of a ROM containing the program to execute, a RAM containing some scratch variables and a processor that understands 10 opcodes. Each instruction is made up of a 4 bit opcode and a 4 bit operand. The _asm_ preprocessor macro is the assembler for this language and is used to fill in the entries in the program ROM declaration.

[1587] The processor has three registers:

[1588] a program counter, pc, that points to the next instruction to be fetched from the ROM

[1589] an instruction register, ir, containing the instruction being executed

[1590] an accumulator register, x, used as one input to the ‘ALU’

[1591] The instructions that the processor can execute are:

[1592] Opcode   Description
HALT     Stop processing
LOAD     Load a value from RAM into x
LOADI    Load a constant into x
STORE    Store x to RAM
ADD      Add a value from RAM to x
SUB      Subtract a value from RAM from x
JUMP     Unconditional jump to a ROM location
JUMPNZ   Jump to a ROM location if x is not 0
INPUT    Read a value into x
OUTPUT   Write x to user

[1593] Using these instructions, a ROM is built containing a program to generate the Fibonacci numbers. The execution unit of the processor simply fetches instructions from the program ROM and executes them using a switch statement. While it may appear to be a simple example, it should be easy to see how this example could be extended to implement a more complex processor. What has been produced is a processor which contains the instructions necessary to calculate Fibonacci numbers. It is equally possible to produce processors which contain specialized instructions for any application. Thus, one could use Handel-C to develop processors capable of executing programs for specialized applications with the minimum of effort. FIGS. 46A and 46B illustrate various commands and debugs 4600, in accordance with one embodiment of the present invention.

[1594] FIGS. 47A through 47C illustrate various icons 4700 that may be utilized, in accordance with one embodiment of the present invention.
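The fetch and execute loop of the example processor can be sketched in Python as follows. Everything beyond the ten opcode names and the 4 bit opcode and operand split is an assumption made for illustration: the numeric opcode values, the 8 bit width of x, the 16 word RAM initialised to zero, the asm helper (a stand-in for the _asm_ macro) and the ROM program itself, which is a hypothetical Fibonacci program rather than the one shipped with the example.

```python
# Hypothetical opcode numbering; the source lists only the names.
HALT, LOAD, LOADI, STORE, ADD, SUB, JUMP, JUMPNZ, INPUT, OUTPUT = range(10)
WORD_MASK = 0xFF   # assumed 8 bit accumulator and RAM words


def asm(opcode, operand=0):
    """Pack a 4 bit opcode and a 4 bit operand into one instruction."""
    return (opcode << 4) | (operand & 0xF)


A, B, COUNT, ONE = 0, 1, 2, 3   # hypothetical RAM addresses

ROM = [
    asm(LOADI, 1), asm(STORE, B), asm(STORE, ONE),   # b = 1, one = 1 (a starts at 0)
    asm(INPUT), asm(STORE, COUNT),                   # count = number of terms wanted
    asm(LOAD, A), asm(OUTPUT),                       # address 5: output a
    asm(ADD, B), asm(STORE, B),                      # b = a + b
    asm(SUB, A), asm(STORE, A),                      # a = old b
    asm(LOAD, COUNT), asm(SUB, ONE), asm(STORE, COUNT),
    asm(JUMPNZ, 5), asm(HALT),                       # loop until count reaches 0
]


def run(rom, inputs):
    """Execute rom until HALT; return the list of values written by OUTPUT."""
    ram = [0] * 16
    pc = 0          # program counter
    x = 0           # accumulator
    pending = list(inputs)
    outputs = []
    while True:
        ir = rom[pc]                       # fetch into the instruction register
        pc += 1
        opcode, operand = ir >> 4, ir & 0xF
        if opcode == HALT:
            return outputs
        if opcode == LOAD:
            x = ram[operand]
        elif opcode == LOADI:
            x = operand
        elif opcode == STORE:
            ram[operand] = x
        elif opcode == ADD:
            x = (x + ram[operand]) & WORD_MASK
        elif opcode == SUB:
            x = (x - ram[operand]) & WORD_MASK
        elif opcode == JUMP:
            pc = operand
        elif opcode == JUMPNZ:
            if x != 0:
                pc = operand
        elif opcode == INPUT:
            x = pending.pop(0) & WORD_MASK
        elif opcode == OUTPUT:
            outputs.append(x)
```

Calling run(ROM, [5]) asks the modelled program for five terms. The input is assumed to be at least 1, because the loop decrements the counter before testing it.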

[1595] Utilities

[1596] Introduction

[1597] The Handel-C compiler package also contains the following utilities.

[1598] bmp2raw converts BMP image files to a format suitable for input to the Handel-C simulator. raw2bmp generates BMP image files from a file generated by the Handel-C simulator.

[1599] The edge detector example requires an image as its source and generates an image as its result. The bmp2raw utility and its partner raw2bmp are provided with the Handel-C compiler to perform conversions between BMP image files and the file format suitable for the Handel-C simulator. They are not restricted to use with the edge detector example and may be used for converting files for other image processing applications.

[1600] These utilities can handle both raw binary and text file formats. This is useful if, as with the edge detector, a conventional C program requires raw binary input and output whereas the simulator requires text input and output. The raw data format can be configured to have the color bits in any order to allow simulation of applications requiring non-standard bit patterns (e.g. 5-6-5 bit RGB format).

[1601] The bmp2raw Utility

[1602] The general usage of the bmp2raw utility is as follows:

[1603] bmp2raw [-b] BMPFile RAWFile RGBFile

[1604] Here BMPFile is the source image file, RAWFile is the destination raw data file and RGBFile is a file describing the format of the pixels in the raw data file.

[1605] Adding the -b flag as the first command line option causes the utility to generate a raw binary file rather than a text file. To see the difference, consider a file containing the numbers 0 to 3. The text version (no -b option) would look like this:

0x00

[1606] 0x01

[1607] 0x02

[1608] 0x03

[1609] The binary version (created with -b option) would not be visible when loaded into an editor. Instead, a hex dump of the file might look like this:

00000000 00 01 02 03 ** ** ** ** . . . ****

The format of the raw data file can be controlled with the RGBFile specified on the command line. This tells the utility where to place each color bit in the words in the raw data file. Internally, the pixels in the BMP file are expanded to 8 bits for each of red, green and blue. The RGB description file has the general format:

Red
Location for bit 7 of red
Location for bit 6 of red
Location for bit 5 of red
Location for bit 4 of red
Location for bit 3 of red
Location for bit 2 of red
Location for bit 1 of red
Location for bit 0 of red
Green
Location for bit 7 of green
Location for bit 6 of green
Location for bit 5 of green
Location for bit 4 of green
Location for bit 3 of green
Location for bit 2 of green
Location for bit 1 of green
Location for bit 0 of green
Blue
Location for bit 7 of blue
Location for bit 6 of blue
Location for bit 5 of blue
Location for bit 4 of blue
Location for bit 3 of blue
Location for bit 2 of blue
Location for bit 1 of blue
Location for bit 0 of blue

[1610] The file works by starting counting at bit 7 of the color specified by the identifier word and working down through the bits of that color, placing each bit in the specified location in the destination word. The destination word may automatically be created wide enough to contain the most significant bit specified (up to 32 bits wide in total).

[1611] One need not specify 8 locations for each color. The least significant bits of each color may be dropped if fewer than 8 locations are specified. In the example below, the least significant 6 bits of red and blue and the least significant 4 bits of green are dropped. FIG. 48 illustrates the various raw file bit numbers and the corresponding color bits 4800.

[1612] Such values use the following RGBFile:

[1613] Red

[1614]7

[1615]2

[1616] Green

[1617]6

[1618]3

[1619]1

[1620]0

[1621] Blue

[1622]5

[1623]4

[1624] Each pixel number and identifier (Red, Green or Blue) may appear on a separate line. One may also specify multiple identifiers of the same color. The bit counter may continue to count down from the value reached for that color each time one specifies the color again. For example, the above file could also be written as follows:

[1625] Red

[1626]7

[1627] Green

[1628]6

[1629] Blue

[1630]5

[1631] Red

[1632]2

[1633] Green

[1634]3

[1635]1

[1636] Blue

[1637]4

[1638] Green

[1639]0
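The bit placement rules described for the RGBFile can be illustrated with a short Python model. This is a sketch, not the bmp2raw implementation: parse_rgb_file and pack_pixel are hypothetical names, and the parsing assumes whitespace-separated tokens with case-insensitive color identifiers. FILE_A is the first example file; FILE_B is the interleaved form, with the second blue location written as 4 so that it describes the same layout.

```python
FILE_A = "Red\n7\n2\nGreen\n6\n3\n1\n0\nBlue\n5\n4\n"
FILE_B = "Red\n7\nGreen\n6\nBlue\n5\nRed\n2\nGreen\n3\n1\nBlue\n4\nGreen\n0\n"


def parse_rgb_file(text):
    """Return, per color, the destination bit locations for source bits 7, 6, 5, ..."""
    locations = {"red": [], "green": [], "blue": []}
    current = None
    for token in text.split():
        word = token.lower()
        if word in locations:
            current = word            # a repeated identifier continues the count
        else:
            locations[current].append(int(word))
    return locations


def pack_pixel(red, green, blue, locations):
    """Place the listed bits of an 8-bit-per-channel pixel into one raw word."""
    word = 0
    for name, value in (("red", red), ("green", green), ("blue", blue)):
        for offset, dest_bit in enumerate(locations[name]):
            source_bit = 7 - offset   # counting starts at bit 7 and works down
            if (value >> source_bit) & 1:
                word |= 1 << dest_bit
    return word
```

Because a repeated identifier simply continues the per-color list, both example files parse to the same mapping, and any source bits without a listed location are dropped.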

[1640] RGBFile Example

[1641] There is an example file provided with the utilities to perform a common conversion.

[1642] 8BPPdest.rgb Extracts red component from source image and generates 8 bit per pixel raw image. Useful for greyscale images.

[1643] The raw2bmp Utility

[1644] The raw2bmp utility is the reverse of the bmp2raw utility. It converts raw text or binary files to BMP image files. The main use of the raw2bmp utility is to allow viewing of the output from image processing applications with the standard Windows 98 or NT Paint utilities.

[1645] The general usage of the raw2bmp utility is as follows:

[1646] raw2bmp [-b] Width RAWFile BMPFile RGBFile

[1647] Width the width of the image (the height may be calculated from this parameter and the source file length).

[1648] RAWFile source file containing raw data.

[1649] BMPFile destination image file.

[1650] RGBFile file describing the format of the pixels in the raw data file.

[1651] Adding the -b flag as the first command line option causes the utility to read a raw binary file rather than a text file.

[1652] The format of the RGBFile describing where each bit is located in the raw data word is similar to the file used by the bmp2raw utility. Indeed, for some pixel formats (including the example presented in the previous section) a common file may be used. As an example of where a different file may be required, consider the conversion of 8 bit per pixel greyscale images to a BMP image. Here, each bit may be duplicated in the red, green and blue components of the destination BMP file. For example:

[1653] red

[1654]7

[1655]6

[1656]5

[1657]4

[1658]3

[1659]2

[1660]1

[1661]0

[1662] green

[1663]7

[1664]6

[1665]5

[1666]4

[1667]3

[1668]2

[1669]1

[1670]0

[1671] blue

[1672]7

[1673]6

[1674]5

[1675]4

[1676]3

[1677]2

[1678]1

[1679]0
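The greyscale description above can be read as a mapping from raw word bits back to color bits. The Python sketch below models that reverse direction under stated assumptions: unpack_pixel and image_height are hypothetical names, the mapping is written out as a dictionary rather than parsed from a file, and the height calculation assumes the number of pixels has already been derived from the source file length.

```python
# Raw bit locations for source bits 7..0 of each color, as listed in the file.
GREY_LOCATIONS = {
    "red": [7, 6, 5, 4, 3, 2, 1, 0],
    "green": [7, 6, 5, 4, 3, 2, 1, 0],
    "blue": [7, 6, 5, 4, 3, 2, 1, 0],
}


def unpack_pixel(word, locations):
    """Rebuild (red, green, blue) 8 bit values from one raw data word."""
    channels = []
    for name in ("red", "green", "blue"):
        value = 0
        for offset, raw_bit in enumerate(locations[name]):
            if (word >> raw_bit) & 1:
                value |= 1 << (7 - offset)   # first location feeds color bit 7
        channels.append(value)
    return tuple(channels)


def image_height(pixel_count, width):
    """Height implied by the Width argument and the number of pixels read."""
    return pixel_count // width
```

With this mapping every raw bit is copied into all three components, so a raw value of 0xA5 becomes a grey pixel with red, green and blue all equal to 0xA5.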

[1680] RGBFile Example

[1681] An example file is provided with the utility:

[1682] 8BPPsrc.rgb Duplicates each bit of 8 bit per pixel raw file to red, green and blue components. Useful for greyscale images.

[1683] Error Messages.

[1684] Introduction

[1685] Most error messages should be obvious. Some of the less obvious ones may be due to system problems, such as files being corrupted, unavailable or in the wrong format, or the system not having enough disk space to write to a file. Error messages that do not fall into these categories are listed below with a brief explanation.

[1686] Handel-C Environment

[1687] “Handel-C cannot continue with Find in Files. Details:”

[1688] File could not be opened or read

[1689] “Handel-C could not insert the project file in to the workspace. Details:”

[1690] File could not be opened or read

[1691] “Handel-C could not load the browse-info database file”

[1692] File could not be opened or read

[1693] “Handel-C could not start the simulator. Details:”

[1694] File could not be opened or read

[1695] “None of the simulator DLLs have any clocks defined.”

[1696] One has no main programs associated with clocks in the compiled code.

[1697] “The simulator ‘NN’ does not have any clocks defined.”

[1698] One has built a function with no clock and attempted to simulate it. One should have a clocked main function that interfaces to the unclocked function.

[1699] “The symbol ‘NN’ is not defined.”

[1700] The cursor is not on a known symbol or a symbol has not been selected in the file

[1701] “There is no browse information for the project NN.”

[1702] One did not have generate browse information selected when he or she compiled the file

[1703] Compiler Error Messages

[1704] “Attempt to access partial struct/union ‘NN’”

[1705] Struct or union not fully defined. E.g. struct S; S x; x.Bill; without the definition struct S { int Bill; };

[1706] “Cannot compile object—not all information is known”

[1707] Could not infer a width or type etc. E.g. int undefined x;

[1708] “Cannot target EDIF—not all information is known”

[1709] Could not infer a width or type etc. E.g. int undefined x;

[1710] “Cannot target RTL level VHDL—not all information is known”

[1711] Could not infer a width or type etc. E.g. int undefined x;

[1712] “Cannot target simulator—not all information is known”

[1713] Could not infer a width or type etc. E.g. int undefined x;

[1714] “Could not determine which clock to use for ‘%s’.”

[1715] An object requiring a clock was built but the compiler couldn't work out which clock it should be connected to. Probably caused by an unused object (the compiler finds clocks from an object's use and not its declaration).

[1716] “Could not infer information about this object”

[1717] Could not infer a width or type etc. E.g. int undefined x;

[1718] “Design contains an unbreakable combinational cycle”

[1719] Compiler could not break a combinatorial code loop.

[1720] “Error while compiling simulation output (%s)”

[1721] The back end simulation compiler (e.g. VC++) failed to compile the simulation output. (E.g. not enough disk space, could not find file, illegal option specified in -cl, internal compiler error etc.).

[1722] “External tool not found (preprocessor or backend C compiler not in path)”

[1723] Error when the compiler cannot run the C preprocessor or the C compiler used to compile the simulation .dll.

[1724] “Illegal use of identifier ‘%s’”

[1725] Probably caused by using a typedef name as a variable.

[1726] “Memory forms do not match”

[1727] Caused by comparing two types of memory (e.g. one is ram int x[1] and the other is rom int y[1]).

[1728] “Syntax error”

[1729] Syntax error in source code

[1730] “Variable ‘%s’ is used from more than one clock domain”

[1731] Data may be passed to different clock domains using a channel or an interface. Variables cannot be shared between clock domains.

[1732] Simulator Error Messages

[1733] Illegal Base Specification

[1734] base specification not 2, 8, 10 or 16

[1735] Invalid Input File

[1736] infile in wrong format

[1737] The simulator also forwards errors from plugins that have been written using the API.

HANDEL-C Language

[1738] This section deals with some of the basics behind the Handel-C language. Handel-C uses the syntax of conventional C with the addition of inherent parallelism. One can write sequential programs in Handel-C, but to gain maximum benefit in performance from the target hardware one may use its parallel constructs. These may be new to some users.

[1739] If one is familiar with conventional C, he or she may recognize nearly all the other features. Handel-C is designed to allow one to express the algorithm without worrying about how the underlying computation engine works. This philosophy makes Handel-C a programming language rather than a hardware description language. In some senses, Handel-C is to hardware what a conventional high-level language is to microprocessor assembly language.

[1740] It is important to note that the hardware design that Handel-C produces is generated directly from the source program. There is no intermediate ‘interpreting’ layer as exists in assembly language when targeting general purpose microprocessors. The logic gates that make up the final Handel-C circuit are the assembly instructions of the Handel-C system.

[1741] Handel-C Programs

[1742] Since Handel-C is based on the syntax of conventional C, programs written in Handel-C are implicitly sequential. Writing one command after another indicates that those instructions should be executed in that exact order.

[1743] Just like any other conventional language, Handel-C provides constructs to control the flow of a program. For example, code can be executed conditionally depending on the value of some expression, or a block of code can be repeated a number of times using a loop construct.

[1744] Parallel Programs

[1745] Because the target of the Handel-C compiler is low-level hardware, massive performance benefits are made possible by the use of parallelism. It is possible (and indeed essential for writing efficient programs) to instruct the compiler to build hardware to execute statements in parallel. Handel-C parallelism is true parallelism; it is not the time-sliced parallelism familiar from general purpose computers.

[1746] When instructed to execute two instructions in parallel, those two instructions may be executed at exactly the same instant in time by two separate pieces of hardware.

[1747] When a parallel block is encountered, execution flow splits at the start of the parallel block and each branch of the block executes simultaneously. Execution flow then re-joins at the end of the block when all branches have completed. FIG. 49 illustrates the manner 4900 in which branches that complete early are forced to wait for the slowest branch before continuing.

[1748] FIG. 49 illustrates the branching and re-joining of the execution flow. The left hand branch 4902 and middle branch 4904 may wait to ensure that all branches have completed before the instruction following the parallel construct can be executed.
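The join behavior can be expressed as simple arithmetic on clock cycles. In the Python sketch below, par_block is a hypothetical helper and the cycle counts are made-up figures chosen so that, as in the description of FIG. 49, the left and middle branches finish before the right one.

```python
def par_block(branch_cycles):
    """Return (cycles taken by the whole block, idle cycles per branch)."""
    finish = max(branch_cycles)                  # the block ends with its slowest branch
    waits = [finish - cycles for cycles in branch_cycles]
    return finish, waits
```

For branches taking 2, 3 and 5 cycles, the block as a whole takes 5 cycles, and the first two branches wait 3 and 2 cycles respectively before execution continues.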

[1749] Channel Communications

[1750] FIG. 50 illustrates the link 5000 between parallel branches, in accordance with one embodiment of the present invention. Channels 5001 provide a link between parallel branches. One parallel branch 5002 outputs data onto the channel and the other branch 5004 reads data from the channel. Channels also provide synchronization between parallel branches because the data transfer can only complete when both parties are ready for it. If the transmitter is not ready for the communication then the receiver may wait for it to become ready and vice versa.

[1751] Here, the channel is shown transferring data from the left branch to the right branch. If the left branch reaches point a before the right branch reaches point b, the left branch waits at point a until the right branch reaches point b.

[1752] Scope and Variable Sharing

[1753] FIG. 51 illustrates the scope 5100 of variables, in accordance with one embodiment of the present invention. The scope of declarations is, as in conventional C, based around code blocks. A code block is denoted with {. . . } brackets. This means that:

[1754] Global variables may be declared outside all code blocks.

[1755] An identifier is in scope within a code block and any sub-blocks of that block.

[1756] Since parallel constructs are simply code blocks, variables can be in scope in two parallel branches of code. This can lead to resource conflicts if the variable is written to simultaneously by more than one of the branches. Handel-C syntax states that a single variable may not be written to by more than one parallel branch but may be read from by several parallel branches. This provides some powerful operations to be described later.

[1757] If one wishes to write to the same variable from several processes, the correct way to do so is by using channels which are read from in a single process. This process can use a prialt statement to select which channel is ready to be read from first, and that channel is the only one which may be allowed to write to the variable.

while (1)
    prialt
    {
        case chan1 ? y:
            break;
        case chan2 ? y:
            break;
        case chan3 ? y:
            break;
    }

[1758] In this case, three separate processes can attempt to change the value of y by sending data down the channels chan1, chan2 and chan3. y may be changed by whichever process sends the data first. A single variable should not be written to by more than one parallel branch.
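The selection performed by the single reading process can be modelled in Python. This is a sketch with assumptions: channels are represented as lists of values waiting to be sent, prialt is a hypothetical function standing in for the statement, and ties between channels that are ready at the same time are assumed to be resolved in favor of the case listed first.

```python
def prialt(pending):
    """Pick the first ready channel in listed order and take one value from it.

    pending maps channel names to lists of waiting values; the dictionary
    order stands for the order of the cases. Returns (name, value), or None
    if no channel is ready.
    """
    for name, values in pending.items():
        if values:
            return name, values.pop(0)
    return None


def drain(pending):
    """Repeatedly apply prialt, recording each value written to y."""
    history = []
    while True:
        choice = prialt(pending)
        if choice is None:
            return history
        _, y = choice          # only the selected channel writes to y
        history.append(y)
```

Only one channel is serviced per selection, so y is never written by two senders at once, which is the property the text requires.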

[1759] Alternate Embodiments

[1760] Introduction

[1761] This section summarizes the new features in Handel-C version 3 for those familiar with previous versions. It also details incompatibilities between the current version and Handel-C version 2.1. The following constructs have been added or changed. Terms specific to Handel-C have been given in bold. All other terms are fully compatible with ISO-C (ISO/IEC 9899:1999) unless otherwise stated. (ISO-C was previously known as ANSI-C.)

[1762] Operator Meaning ISO-C Change in Version 3

[1763] FIGS. 52, 53 and 54 illustrate a table of operators, statements, and macros respectively, along with alternate meanings thereof.

[1764] Linker Changes

[1765] Multiple files can be linked together and loaded into a single FPGA. This allows one to create and access library files. One can load a single chip with multiple main functions. This means that one can have independent logic blocks using different clocks running within the same FPGA. The clock can be internal or external. External clocks may be user specified.

[1766] Language Changes

[1767] ISO-C Compatible Extensions:

[1768] Compatibility with ISO standard C has been increased, so most standard types and derived types are supported. This includes pointers and structures but does not include floats; goto, continue and return are supported. (Note that one cannot use goto, continue or break to enter or exit from a par statement.) Handel-C now supports functions. These can be used instead of macros.

[1769] Functions can be immediately expanded using the inline keyword. To support the multiple files system, prototypes are supported, as are the ISO-C keywords extern and static. One can send messages to the standard error channel using the assert directive.

[1770] Macro Changes

[1771] One can now declare local variables inside a macro expression. There is a new directive, ifselect, which permits conditional compilation according to the result of a test at compile time.

[1772] Statements

[1773] The Handel-C language has been extended to allow code to be replicated using a construct similar to a for loop. This means that one can generate multiple identical copies of the same block of code, either in sequence or in parallel.

[1774] Architecture

[1775] There is a new type to represent signals. One can have multi-dimensional arrays of RAMs and dual-ported RAMs. Interfaces have been extended to allow one to connect to undefined input or output ports. One can also define the sorts of interface and use them to link to blocks of external code (currently VHDL or EDIF). Interface declarations have changed, and the previous style is deprecated.

[1776] Pins no longer need to be assigned. One can omit the data specification to leave the pin assignment unconstrained. In this case, the place and route tools may assign the pins. One can have multiple clocks within a system, and refer to the current clock by using __clock.

[1777] Compiler Changes

[1778] FIG. 55 illustrates a system 5500 including a compiler 5501, in accordance with one embodiment of the present invention. The new compiler has a linker 5502, allowing one to have multiple input files 5504 and links to library files. Multiple files can now be linked into a single output module. These files can be pre-compiled core modules, libraries, header files, or pieces of VHDL code. The extern keyword allows one to reference a function or variable in another file.

[1779] Linking is carried out during a build.

[1780] Incompatibilities with Version 2.1

[1781] Symbol Scoping Rules

[1782] The rules for scoping for macro expr and macro proc constructs have changed between version 2.1 and 3.0. Version 2.1 expands macros in the scope of their use. Version 3.0 expands macros in the scope of their declaration. This is consistent with C scoping rules. For example:

int x; // Version 3.0 may use this x
macro expr a = x;
void main (void)
{
    int x; // Version 2.1 may use this x
    y = a;
}

[1783] This may lead to undeclared identifier errors. For example, the following code is valid in version 2.1 but not in version 3.0, because b is not yet declared at the point where a is declared:

macro proc a (x) { b (x); }
macro proc b (y) { y++; }
void main (void)
{
    int 4 z;
    a (z);
}

[1784] Using Macro Expressions in Widths

[1785] Version 3.0 requires disambiguating brackets around macro expressions used in variable widths. For example:

[1786] int log2ceil(64) x;

[1787] may be rewritten as:

[1788] int (log2ceil(64)) x;
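For readers unfamiliar with the width expression used here, the following Python sketch shows the value such a macro is assumed to produce, namely the base 2 logarithm rounded up to a whole number of bits. The function is a hypothetical model and is not taken from the Handel-C library source.

```python
def log2ceil(n):
    """Ceiling of log2(n) for n >= 1, computed with integer arithmetic."""
    return (n - 1).bit_length()
```

Under this assumption the declaration above gives x a width of 6 bits, which is enough to hold the 64 values 0 to 63.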

[1789] New Keywords Clashing with Variable Names

[1790] Version 3.0 contains a number of new keywords which may clash with variable names in version 2.1 code.

[1791] The list of new keywords is: assert, auto, const, continue, double, enum, extern, float, goto, ifselect, in, inline, let, mpram, register, return, seq, signal, sizeof, static, struct, typedef, typeof, union, volatile, wom.

[1792] Additional Combinational Loops

[1793] Version 2.1 uses approximations when checking for combinatorial loops in the generated logic. Version 3.0 does not use such approximations and may report unbreakable combinational loops in programs which compile with version 2.1.

[1794] Clock is Required for Simulation

[1795] Version 3.0 requires that a clock is specified when generating simulation output. A dummy clock such as ‘set clock = external “P1”;’ is valid.

[1796] Language Basics

[1797] Introduction

[1798] This section of the present description deals with the basics of producing Handel-C programs.

[1799] Program Structure

[1800] Sequential Structure

[1801] As in a conventional C program, a Handel-C program consists of a series of statements which execute sequentially. These statements are contained within a main( ) function that tells the compiler where the program begins. The body of the main function may be split into a number of blocks using {. . . } brackets to break the program into readable chunks and restrict the scope of variables and identifiers.

[1802] Handel-C also has functions, variables and expressions similar to conventional C. There are restrictions where operations are not appropriate to hardware implementation and extensions where hardware implementation allows additional functionality.

[1803] Parallel Structure

[1804] Unlike conventional C, Handel-C programs can also have statements or functions that execute in parallel. This feature is crucial when targeting hardware because parallelism is the main way to increase performance by using hardware. Parallel processes can communicate using channels. A channel is a one-way point-to-point link between two processes.

[1805] Overall Structure

[1806] The overall program structure consists of one or more main functions, each associated with a clock. More than one main function is needed only if parts of the program must run at different speeds (and so use different clocks). A main function is defined as follows:

    Global Declarations
    Clock Definition
    void main(void)
    {
        Local Declarations
        Body Code
    }

[1807] The main( ) function takes no arguments and returns no value. This is in line with a hardware implementation where there are no command line arguments and no environment to return values to. The argc, argv and envp parameters and the return value familiar from conventional C can be replaced with explicit communications with an external system (e.g. a host microprocessor) within the body of the program.

[1808] Using the Preprocessor

[1809] As with conventional C, the Handel-C source code is passed through a C preprocessor before compilation. Therefore, the usual #include and #define constructs may be used to perform textual manipulation on the source code before compilation.

[1810] Handel-C also supports macros that are more powerful than those handled by the preprocessor.

[1811] Comments

[1812] Handel-C uses the standard /* . . . */ delimiters for comments. These comments may not be nested. For example:

[1813] /* Valid comment */

[1814] /* This is /* NOT */ valid */

[1815] Handel-C also provides the C++ style // comment marker which tells the compiler to ignore everything up to the next newline. For example:

[1816] x = x + 1; // This is a comment

[1817] Comments are handled by the preprocessor.

[1818] Declarations

[1819] Introduction

[1820] This section of the present description details the types of declarations that can be made and the way that the type system in Handel-C differs from that of conventional C.

[1821] Handel-C Values and Widths

[1822] A crucial difference between Handel-C and conventional C is Handel-C's ability to handle values of arbitrary width. Since conventional C is targeted at general purpose microprocessors, it handles 8, 16 and 32 bit values well but cannot easily handle other widths. When targeting hardware, there is no reason to be tied to these data widths, and so Handel-C has been extended to allow types of any number of bits. Handel-C has also been extended to cope with extracting bits from values and joining values together to form wider values. These operations require no hardware and can provide great performance improvements over software.

[1823] When writing programs in Handel-C, care should be taken that data paths are no wider than necessary to minimize hardware usage. While it may be valid to use 32-bit values for all items, a large amount of unnecessary hardware is produced if none of these values exceed 4 bits. Care should also be taken that values do not overflow their width. This is more of an issue with Handel-C than with conventional C because variables should be just wide enough to contain the largest value required (and no wider).
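The overflow concern can be illustrated with a small software model. The sketch below is written in Python rather than Handel-C, and it assumes that a value too large for its width simply loses its high-order bits (the usual behavior of a fixed-width register); the text above does not state what happens on overflow, so that behavior and the helper names wrap_unsigned and wrap_signed are assumptions made for illustration.

```python
def wrap_unsigned(value: int, width: int) -> int:
    """Keep only the low `width` bits, as a fixed-width register would (assumed)."""
    return value & ((1 << width) - 1)


def wrap_signed(value: int, width: int) -> int:
    """Interpret the low `width` bits as a two's complement number."""
    low = value & ((1 << width) - 1)
    sign_bit = 1 << (width - 1)
    return low - (1 << width) if low & sign_bit else low


# A 4-bit unsigned variable holding 15 cannot represent 16.
print(wrap_unsigned(15 + 1, 4))
# A 4-bit signed variable holding 7 cannot represent 8.
print(wrap_signed(7 + 1, 4))
```

Under this assumption a variable sized for exactly its largest expected value has no headroom, which is why the width should be chosen from the true maximum rather than a typical value.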

[1824] Constants

[1825] Constants may be used in expressions. Decimal constants are written as simply the number, hexadecimal constants are prefixed with 0x or 0X, octal constants are prefixed with a zero and binary constants are prefixed with 0b or 0B. For example:

[1826] w = 1234; /* Decimal */

[1827] x = 0x1234; /* Hexadecimal */

[1828] y = 01234; /* Octal */

[1829] z = 0b00100110; /* Binary */
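As a cross-check of the four notations, the following Python sketch decodes constants written with the prefixes described above. The function name parse_constant is hypothetical, and the sketch only models the prefix rules; it is not part of any Handel-C tool.

```python
def parse_constant(text: str) -> int:
    """Decode a constant using the prefix rules described in the text."""
    text = text.strip()
    lowered = text.lower()
    if lowered.startswith("0x"):
        return int(text[2:], 16)   # hexadecimal: 0x or 0X prefix
    if lowered.startswith("0b"):
        return int(text[2:], 2)    # binary: 0b or 0B prefix
    if len(text) > 1 and text.startswith("0"):
        return int(text[1:], 8)    # octal: leading zero
    return int(text, 10)           # decimal: no prefix


for literal in ("1234", "0x1234", "01234", "0b00100110"):
    print(literal, "=", parse_constant(literal))
```

Note that 1234, 0x1234 and 01234 share the same digits but denote three different values, so the prefix cannot be dropped when porting constants.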

[1830] The width of a constant may be explicitly given by ‘casting’. For example:

[1831] x = (unsigned int 3) 1;

[1832] Casting may be necessary where the compiler is unable to infer the width of the constant from its usage.

[1833] Types

[1834] Handel-C uses two kinds of objects: logic types and architectural types. The logic types specify variables. The architectural types specify variables that require a particular sort of hardware architecture (e.g., ROMs, RAMs and channels). Both kinds are specified by their scope (static or extern), their size and their type. Architectural types are also specified by the logic type that uses them.

[1835] Both types can be used in derived types (such as structures, arrays or functions) but there may be some restrictions on the use of architectural types.

[1836] Specifiers

[1837] The type specifiers signed, unsigned and undefined define whether the variable is signed and whether it takes a default defined width. One can use the storage class specifiers extern and static to define the scope of any variable.

[1838] Functions can have the storage class inline to show that they are expanded in line, rather than being shared.

[1839] Type Qualifiers

[1840] Handel-C supports the type qualifiers const and volatile to increase compatibility with ISO-C. These can be used to further qualify logic types.

[1841] Disambiguator

[1842] Handel-C supports the extension <>. This can be used to clarify complex declarations of architectural types.

[1843] Logic Types

[1844] The basic logic type is an int. It may be qualified as signed or unsigned. Integers can be manually assigned a width by the programmer, or the compiler may attempt to infer a width from use. Enumeration types (enums) allow one to define a specified set of values that a variable of this type may hold. There are also derived types (types that are derived from the basic types). These are arrays, pointers, structs, bit fields and functions. The non-type void enables one to declare empty parameter lists or functions that do not return a value. The typeof type operator allows one to reference the type of a variable.

[1845] Int

[1846] There is only one fundamental type for variables: int. By default, integers are signed. The int type may be qualified with the unsigned keyword to indicate that the variable only contains positive integers or 0. For example:

[1847] int 5 x;

[1848] unsigned int 13 y;

[1849] These two lines declare two variables: a 5-bit signed integer x and a 13-bit non-negative integer y. In the second example here, the int keyword is optional. Thus, the following two declarations are equivalent.

[1850] unsigned int 6 x;

[1851] unsigned 6 x;

[1852] One may use the signed keyword to make it clear that the default type is used. The following declarations are equivalent.

[1853] int 5 x;

[1854] signed int 5 x;

[1855] signed 5 x;

[1856] The range of an 8-bit signed integer is -128 to 127 while the range of an 8-bit unsigned integer is 0 to 255 inclusive. This is because signed integers use 2's complement representation. One may declare a number of variables of the same type and width simultaneously. For example:

[1857] int 17 x, y, z;

[1858] This declares three 17-bit wide signed integers x, y and z.
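The ranges quoted for 8-bit values generalize to any width n: a signed two's complement integer spans -2^(n-1) to 2^(n-1) - 1 and an unsigned integer spans 0 to 2^n - 1. The following Python sketch (the name int_range is hypothetical) computes these bounds for the widths used in the declarations above.

```python
from typing import Tuple


def int_range(width: int, signed: bool = True) -> Tuple[int, int]:
    """Return (minimum, maximum) for an integer of the given bit width."""
    if width < 1:
        raise ValueError("width must be at least 1 bit")
    if signed:
        # Two's complement: one bit pattern more on the negative side.
        return -(1 << (width - 1)), (1 << (width - 1)) - 1
    return 0, (1 << width) - 1


print(int_range(8))                 # signed 8-bit
print(int_range(8, signed=False))   # unsigned 8-bit
print(int_range(5))                 # int 5 x;
print(int_range(13, signed=False))  # unsigned int 13 y;
print(int_range(17))                # int 17 x, y, z;
```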

[1859] Supported Types for Porting

[1860] Handel-C provides support for porting from conventional C by allowing the types char, short and long. For example:

[1861] unsigned char w;

[1862] short y;

[1863] unsigned long z;

[1864] The widths assumed for each of these types are as follows:

    Type     Width
    char     8 bits (signed)
    short    16 bits
    long     32 bits

[1865] Smaller and more efficient hardware may be produced by only using variables of the smallest possible width.

[1866] More about Widths

[1867] The Handel-C compiler can sometimes infer the width of variables from their usage. It is therefore not always necessary to explicitly define the width of all variables, and the undefined keyword can be used to tell the compiler to try to infer the width of a variable. For example:

[1868] int 6 x;

[1869] int undefined y;

[1870] x = y;

[1871] In this example the variable x has been declared to be 6 bits wide and the variable y has been declared with no explicit width. The compiler can infer that y must be 6 bits wide from the assignment operation later in the program and sets the width of y to this value. If the compiler cannot infer all the undefined widths, it generates errors detailing which widths it could not infer. The undefined keyword is optional, so the two definitions below are equivalent:

[1872] int x;

[1873] int undefined x;

[1874] Handel-C provides an extension to allow one to override this behavior to ease porting from conventional C. This allows one to set a width for all variables that have not been assigned a specific width or declared as undefined.

[1875] This is done as follows:

[1876] set intwidth = 16;

[1877] int x;

[1878] unsigned int y;

[1879] This declares a 16-bit wide signed integer x and a 16-bit wide unsigned integer y. Any width may be used in the set intwidth instruction, including undefined. One can still declare variables that have their width inferred by using the undefined keyword. For example:

[1880] set intwidth = 27;

[1881] unsigned x;

[1882] unsigned undefined y;

[1883] This example declares a variable x with a width of 27 bits and a variable y that has its width inferred by the compiler. This example also illustrates that the int keyword may be omitted when declaring unsigned integers. One may also set the default width to be undefined:

[1884] set intwidth = undefined;

[1885] Syntax

[1886] [ signed | unsigned ] int [undefined | n ] Name

[1887] Arrays

[1888] One can declare arrays of variables in the same way that arrays are declared in conventional C. For example:

[1889] int 6 x[7];

[1890] This declares 7 registers each of which is 6 bits wide. Accessing the variables is exactly as in conventional C. For example, to access the fifth variable in the array:

[1891] x[4] = 1;

[1892] Note that as in conventional C, the first variable has an index of 0 and the last has an index of n−1 where n is the total number of variables in the array. One can also declare multi-dimensional arrays of variables. For example:

[1893] unsigned int 6 x[4] [5] [6];

[1894] This declares 4*5*6=120 variables each of which is 6 bits wide.

[1895] Accessing the variables is as expected from conventional C. For example:

[1896] y = x[2] [3] [1];

Example

[1897] This loop initializes each element of array ax to the value of its index:

    unsigned int 6 ax[7];
    unsigned index;
    index = 0;
    do
    {
        ax[index] = (0 @ index);
        index++;
    } while (index <= 6);

[1898] Note that the width of index has to be adjusted in the assignment. This is because its width may be inferred to be 3, from the array dimension (the array has 7 elements, so “index” may only ever need to count as far as 6).
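The width adjustment can be checked with a short software model. The Python sketch below assumes that the @ operator places its left operand in the high-order bits and its right operand in the low-order bits, and that the constant 0 takes whatever width is needed to reach 6 bits; both points are assumptions about the example, and the helper names bits_needed and concat are hypothetical.

```python
def bits_needed(max_value: int) -> int:
    """Smallest unsigned width able to hold max_value."""
    return max(1, max_value.bit_length())


def concat(high: int, high_width: int, low: int, low_width: int):
    """Join two fixed-width values; returns (value, total_width)."""
    high &= (1 << high_width) - 1
    low &= (1 << low_width) - 1
    return (high << low_width) | low, high_width + low_width


index_width = bits_needed(6)   # index only needs to count as far as 6
pad_width = 6 - index_width    # width the constant 0 must take
print("index width:", index_width, "padding width:", pad_width)

# Every index value, padded with zeros, becomes a 6-bit value equal to itself.
for index in range(7):
    value, width = concat(0, pad_width, index, index_width)
    print(index, "->", value, "in", width, "bits")
```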

[1899] Enum

[1900] enum specifies a list of constant integer values, e.g.

    enum weekdays {MON, TUES, WED, THURS, FRI};

The first name (in this case MON) has a value of 0, the next 1, and so on, unless explicit values are specified. If not all values are specified, values increment from the last specified value. To specify enum values:

    enum weekdays {MON=9, TUES, WED, THURS, FRI};

In the beta release, one cannot declare a variable of type enum (for example, enum weekdays x; is not allowed). One can assign enum values to a variable (e.g. int x=MON;).
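The numbering rule for enum members can be modelled in a few lines. This Python sketch (the function name enum_values is hypothetical) applies the rule stated above: start at 0, and continue counting from the most recent explicit value.

```python
def enum_values(members):
    """members: list of (name, explicit_value_or_None) in declaration order."""
    values = {}
    next_value = 0
    for name, explicit in members:
        if explicit is not None:
            next_value = explicit   # restart counting from the explicit value
        values[name] = next_value
        next_value += 1
    return values


names = ["MON", "TUES", "WED", "THURS", "FRI"]
print(enum_values([(n, None) for n in names]))
print(enum_values([("MON", 9)] + [(n, None) for n in names[1:]]))
```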

[1901] struct

[1902] struct defines a data structure: a grouping together of variables under a single name. The format of the structure can be identified by a type name. The variable members of the structure may be of the same or different types. Once a structure has been declared, its type name can be used to define other structures of the same type. Structure members may be accessed individually using the construct struct_Name.member_Name.

[1903] Syntax

[1904] A structure type is declared using the format struct [type_Name] { member-list } [instance_Names];

[1905] member-list is a list of variable definitions terminated by semi-colons. The use of instance_Names declares variables of that structure type. Alternatively, one may declare variables as follows:

[1906] struct type_Name instance_Name;

[1907] Storage

[1908] Structures may be passed through channels and signals. Structures may be stored in internal memory elements. Structures cannot be stored in off-chip RAMs. If a structure contains a memory element, a channel, or a signal, it cannot be stored in another memory element, it cannot be passed to a function “by value”, it cannot be assigned to and it cannot be passed through a channel or a signal. If a structure contains a memory element with more than one member, it cannot be assigned (or assigned to) another structure as the assignment cannot be performed in a single clock cycle. Whole structures may not be sent directly to interfaces.

[1909] Example

    struct human // Declare human struct type
    {
        unsigned int 8 age; // Declare member types
        int 1 sex;
        char name[25];
    }; // Define human type
    struct human sister;
    sister.age = 25;

[1910] Bit Field

[1911] A bit field is a type of structure member consisting of a specified number of bits. The length of each field is separated from the field name by a colon (:). Each element can be accessed independently. Since Handel-C allows one to specify the width of integers in bits, a bit field is merely another way of specifying a standard structure. In ISO-C, bit fields are made up of words, and only the specified bits are accessed; the rest are padded. Padding in ISO-C is implementation dependent. Nothing can be assumed about padding in Handel-C.

[1912] Syntax struct { field_Type field_Name: field_Width ... }

[1913] Example

[1914] This example defines a set of flags first as a structure (outputs) and then as a bit field (signals), followed by a union declaration and accesses to its members:

    struct structure
    {
        unsigned int 1 LED;
        unsigned int 1 signal;
        unsigned int 1 switch;
    } outputs;

    struct bitfield
    {
        unsigned int LED : 1;
        unsigned int signal : 1;
        unsigned int switch : 1;
    } signals;

    union united
    {
        unsigned char chis[2];
        unsigned short shis;
    };
    union united unity;
    unsigned a;
    par
    {
        unity.chis[0] = 2;
        unity.chis[1] = 50;
    }
    unity.shis = 33;

[1915] Pointers and Addresses

[1916] Pointers in Handel-C are similar to those in conventional C. They provide the address of a variable or a piece of code. This enables one to access variables by reference rather than by value. The indirection operator * is the same as it is in ISO-C. It is used to declare pointers to objects, and to de-reference pointers (i.e. to access objects pointed to by pointers).

[1917] The “address of” operator (&) works as it does in ISO-C (although technically Handel-C variables are not usually stored in memory locations that need to be addressed).

[1918] Pointers

[1919] A pointer declaration consists of the indirection operator (*), the name of the pointer and the type of the variable that it points to:

    type *Name

Pointers are used to point to variables in conjunction with the unary operator &, which gives the address of an object. To set a pointer to point to a variable, one may assign the address of the variable to the pointer. For example:

    int 8 *ptr; // declare a pointer to an int 8
    int 8 object, x;
    object = 6;
    x = 10;
    ptr = &object; // assigns the address of object to ptr
    x = *ptr; // x is now 6
    *ptr = 12; // object is now 12

[1920] In Handel-C, one may only cast void pointers (void * pointerName) to a different type. All other pointers may only be cast to change the sign of an object pointed to, and whether it is const or volatile. These restrictions are the standard casting restrictions in Handel-C. One can change a void pointer's type by casting, assignment or comparison.

[1921] Valid pointer operations include:

[1922] Assign a pointer to another pointer of the same type

[1923] Add or subtract a pointer and an integer

[1924] Subtract or compare a pointer to an array member with another pointer to a member of the same array

[1925] Assign or compare a pointer to NULL.

[1926] Pointers to Functions

[1927] If one points to code (a function), the address operator is not required. The syntax is:

    returnType (*pointerName)(parameter list)

The parentheses at the end of the declaration declare the pointer to be a pointer to a function. The indirection operator before the pointerName declares it to be a pointer declaration. There is the standard C type ambiguity between the declaration of a function returning a pointer and a pointer to a function. To ensure that the indirection operator is associated with the pointer name rather than the return type, one needs to use parentheses:

    int 8 *functionName( ); // function returning pointer
    int 8 (*pointerName)( ); // pointer to function

[1928] * operator / & operator

[1929] The indirection operator * is the same as it is in ISO-C. It is used to declare pointers to objects, and to de-reference pointers (i.e. to access objects pointed to by pointers). The address operator (&) works as it does in ISO-C (although technically Handel-C variables are not usually stored in memory locations that need to be addressed). For example:

    unsigned char cha, chb, *chp;
    chp = &cha;
    cha = 90;
    chb = *chp;
    chp = &chb;

[1930] The first line declares two unsigned variables (cha and chb), and a pointer to an unsigned (chp). The second line assigns the address of cha to pointer chp. In other words, pointer chp now points to variable cha. The third line simply assigns a value to cha. The fourth line dereferences pointer chp, to access what it is pointing to, which is cha. In other words, chb is assigned the value of the object pointed to by chp. The last line assigns the address of chb to pointer chp. In other words, pointer chp now points to variable chb. The following can also be used: pointers to arrays, pointers to channels, pointers to signals, pointers to memory elements, pointers to structures and unions, pointers to pointers, arrays of pointers. For instance:

[1931] struct S { int 6 a, b; } s1, s2, *sp, **spp;
sp = &s1;
spp = &sp;
s2 = **spp;

[1932] This declares two variables of type struct S (s1 and s2), a pointer to a variable of this type (sp), and a pointer to a pointer to a variable of this type (spp). The next line assigns the address of structure s1 to pointer sp (pointer sp now points to structure s1). The following line assigns the address of pointer sp to pointer spp (pointer spp now points to pointer sp). The last line dereferences pointer spp twice, and it assigns the dereferenced value, which is s1, to structure s2 (i.e. s2 now equals s1).

[1933] Structure Pointers

[1934] The structure pointer operator (->) can be used, as in ISO-C. It is used to access the members of a structure or union when the structure or union is referenced through a pointer. For example:

    struct S { int 18 a, b; } s, *sp;
    sp = &s;
    s.a = 26;
    sp->b = sp->a;

[1935] The last line accesses the member variables of structure s through pointer sp. Because the pointer is being used to access the structure, the -> operator is used to refer to the member variables.

[1936] sp->a ≡ (*sp).a

[1937] One can cast structure pointers between structures with the same member types. For example:

    struct S1 { int 6 x; } s1;
    struct S2 { int 6 y; } s2;
    struct S1 *strctPtr = &s1;
    s2.y = ((struct S2 *) strctPtr)->y;

[1938] Architectural Types

[1939] The architectural types are channels (used to communicate between parallel processes), interfaces (used to connect to pins or provide signals to communicate with external code), memories (rom, ram, wom and mpram) and signal (declares a wire). The disambiguator <> has been provided to help clarify the definitions of memories, channels and signals.

[1940] Channels

[1941] Handel-C provides channels for communicating between parallel branches of code. One branch writes to a channel and a second branch reads from it. The communication only occurs when both tasks are ready for the transfer, at which point one item of data is transferred between the two branches. Channels are declared with the chan keyword. For example:

[1942] chan int 7 link;

[1943] As with variables, the Handel-C compiler can infer the width of a channel from its usage if it is declared with the undefined keyword. Channels can also be declared with no explicit type, in which case the compiler infers the type and width of the channel from its usage. For example:

    set intwidth = undefined;
    chan int Link1;
    chan unsigned undefined Link2;
    chan Link3;

Syntax:

    chan [logic Type] Name

[1944] Arrays of Channels

[1945] Handel-C allows arrays of channels to be declared. For example:

[1946] chan unsigned int 5 x[6];

[1947] This is equivalent to declaring 6 channels each of which is 5 bits wide. A channel can be accessed by specifying its index. As with variable arrays, the index for the nth element is n−1. For example:

[1948] x[4] ! 3; // Output 3 on channel x[4]

[1949] x[3] ? y; // Input to y from channel x[3]

[1950] It is also possible to declare multi-dimensional arrays of channels. For example:

[1951] chan unsigned int 6 x[4][5][6];

[1952] This declares 4*5*6=120 channels each of which is 6 bits wide. Accessing the channels is similar to accessing arrays in conventional C. For example:

[1953] x[2][3][1] ! 4; // Output 4 on channel x[2][3][1]

[1954] Interfaces

[1955] One may use an interface to communicate with an external device or component. An interface consists of data ports, together with information about each port. A port definition consists of the data type that uses it (either defined or inferred from its first use), an optional name and the specification for that port (e.g., input pins for a bus) if needed.

[1956] Targeting Hardware

[1957] The different varieties of interfaces are known as sorts. Handel-C provides predefined sorts (bus_in, bus_latch_in, bus_clock_in, bus_out, bus_ts, bus_ts_latch_in, bus_ts_clock_in, port_in and port_out). The Handel-C bus sorts (bus_*) generate the hardware for buses connected to pins. The port_in and port_out sorts generate the hardware for floating ports (buses which are not connected to pins). These can be of any width, and can carry signals between different sections of Handel-C code, or to software or hardware beyond the Handel-C program. One may also define the interface to connect to two kinds of non-Handel-C objects. The first is native PC object code used in simulation. Programs that run on the PC for simulation and connect to a Handel-C interface are known as plugins. There are special port specifications to enable one to connect user-defined interfaces with a plugin for simulation. These are extlib, extfunc, extpath and extinst. The second is hardware descriptions written in another language. Currently only VHDL and EDIF are supported. For a VHDL code interface, the interface sort would be the name of the VHDL entity.

[1958] The style of interface declaration used in Handel-C Version 2 is deprecated, but remains for backward compatibility. The recommended style is to declare an interface sort and then to define instances of that sort. The interface declaration gives the port names and types but no further details about them. The interface definition gives the port specifications (if needed) and assigns data to be transmitted to the output ports. The interface declaration has the form:

    interface Sort ({data_TO_hc}) ({send_FROM_hc})

[1959] Sort may be a user-defined name or one of the pre-defined sorts (bus_in, bus_latch_in, bus_clock_in, bus_out, bus_ts, bus_ts_latch_in, bus_ts_clock_in, port_in and port_out). data_TO_hc is optional. It consists of one or more prototypes of ports bringing data TO the Handel-C code from the outside world. A port prototype consists of the port type and the port name. send_FROM_hc is optional. It consists of one or more definitions of ports carrying data FROM the Handel-C code to the outside world (port definition as above). At least one port (whether to Handel-C or from Handel-C) must be declared. The interface definition has the form:

    interface Sort ({port_TO_hc [with {portSpec}]})
        Name ({port_FROM_hc = outputDataItem [with {portSpec}]})
        with {generalSpecs};

[1960] Sort is a pre-declared interface sort (as above).

[1961] port_TO_hc consists of definitions of the ports bringing data to Handel-C that were prototyped in the interface declaration. These ports have the type given in the prototype, but may also have port specifications. The most likely use of a port specification is if one were interfacing with an external DLL (dynamic linked library) and needed to specify the external function that this port required (extfunc). Name is a user-defined identifier for that instance of the interface. port_FROM_hc consists of definitions of ports sending data from the Handel-C code that were prototyped in the sort declaration. These ports have the type given in the prototype, but may also have port specifications. Each port_FROM_hc port should be assigned an expression outputDataItem. The value of outputDataItem is sent to that port. with {generalSpecs} is optional. It consists of one or more port specifications that apply to all the ports within the interface. One might wish to specify the external simulator that handles this type of port (generates input and receives output) using the extlib directive.

[1962] FIG. 56 illustrates the various specifications 5600 for the interfaces of the present invention.

[1963] Note that ports to the code precede the interface Name and ports from it follow it.

[1964] Example

[1965] Further examples of bus interfaces are given later. The present example shows an interface declaration used to connect to a piece of foreign code, and the definition that uses this declaration.

    // Interface declaration
    interface ttl7446 (unsigned 7 segments, unsigned 1 rbon)
        (unsigned 1 ltn, unsigned 1 rbin, unsigned 4 digit, unsigned 1 bin);

    // Interface definition
    interface ttl7446 (unsigned 7 segments, unsigned 1 rbon)
        decode (unsigned 1 ltn=ltnVal, unsigned 1 rbin=rbinVal,
                unsigned 4 digit=digitVal, unsigned 1 bin=binVal)
        with {extlib="PluginModelSim.dll",
              extinst="decode; model=ttl7446_wrapper; delay=1"};

[1966] Internal RAMs and ROMs

[1967] RAMs and ROMs may be built from the logic provided in the FPGA using the ram and rom keywords. For example:

[1968] ram int 6 a[43];

[1969] rom int 16 b[4] = {23, 46, 69, 92};

[1970] This example constructs a RAM consisting of 43 entries each of which is 6 bits wide and a ROM consisting of 4 entries each of which is 16 bits wide.

[1971] To initialize a static or global ROM, one can use the format

[1972] rom int 16 b[4] = { 23, 46, 69, 92 };

[1973] The ROM is initialized with the constants given in the list in much the same way as an array would be initialized in C. In this example, the ROM entries are given the following values:

    ROM entry    Value
    b[0]         23
    b[1]         46
    b[2]         69
    b[3]         92

[1974] The Handel-C compiler can also infer the widths, types and the number of entries in RAMs and ROMs from their usage. Thus, it is not always necessary to explicitly declare these attributes. For example:

[1975] ram int undefined a[123];

[1976] ram int 6 b[ ];

[1977] ram c[43];

[1978] ram d[ ];

[1979] RAMs and ROMs are accessed in much the same way as arrays. For example:

[1980] ram int 6 b[56];

[1981] b[7] = 4;

[1982] This sets the eighth entry of the RAM to the value 4. Note that as in conventional C, the first entry in the memory has an index of 0 and the last has an index of n−1 where n is the total number of entries in the memory.

[1983] Note that RAMs differ from arrays in that an array is equivalent to declaring a number of variables. Each entry in an array may be used exactly like an individual variable with as many reads and writes in a clock cycle as required. RAMs, however, are normally more efficient to implement in terms of hardware resources than arrays. Therefore, one should use an array when the elements need to be accessed more than once in parallel and a RAM when efficiency is needed. Accessing internal RAMs can only be done in the way described above on Altera or Xilinx devices with synchronous on-chip RAMs. This includes Altera Flex 10K and APEX, Xilinx 4000E, 4000EX, 4000L, 4000XL, 4000XV, Spartan, Spartan II and Virtex 10K series devices. Other memories may require timing specifications.

[1984] RAMs and ROMs may only have one entry accessed in any clock cycle. This restriction is discussed in more detail later.

[1985] Multidimensional Arrays

[1986] It is possible to create simple multi-dimensional arrays of memory using the ram, rom and wom keywords. The definitions can be made clearer by using the optional disambiguator <>.

[1987] Syntax

[1988] ram | rom | wom logicType entry_width Name {[const_expression]}

[1989] [={initialisation strings}];

[1990] Possible logic types are ints, structs, pointers and arrays. The last constant expression is the index for the RAM. The other indices give the number of copies of that type of RAM.

[1991] Example

    ram <int 6> a[15][43];
    rom <int 16> b[4][2][2] = { {{1, 2}, {3, 4}},
                                {{5, 6}, {7, 8}},
                                {{9, 10}, {11, 12}},
                                {{13, 14}, {15, 16}} };

[1992] This example constructs 15 RAMs, each consisting of 43 entries 6 bits wide, and 4 * 2 ROMs, each consisting of 2 entries 16 bits wide. The ROM is initialized with the constants in the list in the same way as a multidimensional array would be initialized in C. The last index (that of the RAM entry) changes fastest. FIG. 57 illustrates a table 5700 showing the ROM entries, in accordance with one embodiment of the present invention.

[1993] Because of their architecture, RAMs and ROMs are restricted to performing operations sequentially. Only one element of a RAM or ROM may be addressed in any given clock cycle and, as a result, familiar looking statements are often disallowed. For example:

[1994] ram <unsigned int 8> x[4];

[1995] x[1] = x[3] + 1;

[1996] This code is illegal because the assignment attempts to read from one element of x (x[3]) in the same cycle as it writes to another element (x[1]) of the same RAM. In a multi-dimensional array, one can access separate elements of the arrays, so long as the accesses are not to the same RAM (as selected by the penultimate array index). For example:

[1997] x[2][1]=x[3][1] is valid

[1998] x[2][1]=x[2][0] is invalid
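The layout and access rules for multi-dimensional memories can be summarized in a small model. The Python sketch below treats every index except the last as selecting a RAM and the last index as selecting an entry within it, and it flags an assignment as invalid when source and destination fall in the same RAM. The function names are hypothetical, and the check assumes that both operands refer to the same declared memory.

```python
def memory_layout(dims):
    """Return (number_of_rams, entries_per_ram) for the declared dimensions."""
    rams = 1
    for size in dims[:-1]:
        rams *= size            # leading indices count copies of the RAM
    return rams, dims[-1]       # last index addresses an entry in one RAM


def same_ram(a, b):
    """Two accesses hit the same RAM when all but the last index agree."""
    return tuple(a[:-1]) == tuple(b[:-1])


def single_cycle_ok(dest, src):
    """An assignment is allowed only if it touches two different RAMs."""
    return not same_ram(dest, src)


print(memory_layout((15, 43)))           # ram <int 6> a[15][43];
print(memory_layout((4, 2, 2)))          # rom <int 16> b[4][2][2];
print(single_cycle_ok((2, 1), (3, 1)))   # x[2][1] = x[3][1]
print(single_cycle_ok((2, 1), (2, 0)))   # x[2][1] = x[2][0]
print(single_cycle_ok((1,), (3,)))       # x[1] = x[3] + 1
```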

[1999] Note that arrays of variables do not have these restrictions but may require substantially more hardware to implement than RAMs depending on the target architecture.

[2000] mpram (Multi-Ported RAMs)

[2001] One can create multiple-ported RAMs (MPRAMs) by constructing something like an ISO-C union, using the mpram keyword. mprams can be used to connect two independent code blocks. The clock of the mpram port is taken from the function in which it is used. The normal declaration of an MPRAM would be to create a dual-ported RAM by declaring two ports of equal width: for Altera, one port would be read-only and one write-only; for Xilinx 4000, one port would be read/write and one read-only; and for Virtex, both ports would be read/write. Syntax:

    mpram MPRAM_name
    {
        ram_Type variable_Type RAM_Name[width];
        ram_Type variable_Type RAM_Name[width];
    };

[2002] Example

[2003] Using an mpram to communicate between two independent logic blocks:

File 1:

    mpram Fred
    {
        ram <unsigned 8> ReadWrite[256]; // Read/write port
        rom <unsigned 8> Read[256]; // Read only port
    };
    mpram Fred Joan; /* Declare Joan as an mpram like Fred */
    set clock = internal "F8M";
    void main (void)
    {
        unsigned 8 data;
        Joan.ReadWrite[7] = data;
    }

File 2:

    mpram Fred
    {
        ram <unsigned 8> ReadWrite[256]; // Read/write port
        rom <unsigned 8> Read[256]; // Read only port
    };
    extern mpram Fred Joan;
    set clock = external "P2";
    void main (void)
    {
        unsigned 8 data;
        data = Joan.Read[7];
    }

[2004] Mapping of Different Width Ports

[2005] If the ports of the mpram are of different widths, they may be mapped onto each other according to the specifications of the chip being used. If the ports used are of different widths, the widths should be powers of two (2^n). Different width ports are not available with Altera devices.

[2006] Xilinx Bit Mapping

[2007] To find the bits that an array element occupies in a Xilinx Virtex or 4000 series RAM, one can use the following formula: element a of a RAM declared with entry width y (ram y Name[...]) has a start bit of ((a+1)*y)−1 and an end bit of y*a. Xilinx mapping is little-endian. This means that the address points to the LSB. The bits between the declarations of RAM are mapped directly across, so that bit 27 in one declaration has the same value as bit 27 in another declaration, even though the bits may be in different array elements in the different declarations.

    mpram Joan
    {
        ram <unsigned 4> ReadWrite[256]; // Read/write port
        rom <unsigned 8> Read[256]; // Read only port
    };

Joan.ReadWrite[100] runs from bit 400 to bit 403. Joan.Read[100] runs from bit 800 to bit 807. Joan.Read[50] runs from bit 400 to bit 407. Joan.ReadWrite[100] is equivalent to Joan.Read[50][0 : 3].
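The bit ranges quoted for Joan (400 to 403, 800 to 807 and 400 to 407) are all reproduced by taking the lowest bit of element a as y*a and the highest bit as y*(a+1)−1, where y is the entry width. The Python sketch below checks this arithmetic; the function names bit_range and contains are hypothetical.

```python
def bit_range(entry_width: int, index: int):
    """Return (lowest_bit, highest_bit) occupied by one array element."""
    low = entry_width * index
    high = entry_width * (index + 1) - 1
    return low, high


def contains(outer, inner) -> bool:
    """True if the bit range `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


read_write_100 = bit_range(4, 100)   # Joan.ReadWrite[100], 4-bit port
read_100 = bit_range(8, 100)         # Joan.Read[100], 8-bit port
read_50 = bit_range(8, 50)           # Joan.Read[50], 8-bit port

print(read_write_100, read_100, read_50)
# The 4-bit element shares its bits with the low half of the 8-bit element.
print(contains(read_50, read_write_100))
```

Because the two ports view the same underlying bits, an element index on the narrow port corresponds to a different index on the wide port, which is the reason ReadWrite[100] overlaps Read[50] rather than Read[100].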

[2008] Initialisation of mprams

[2009] The first member of the mpram can be initialized.

mpram Fred
{
    ram <unsigned 8> ReadWrite [256]; // Read/write port
    rom <unsigned 8> Read [256]; // Read only port
} Mary = {10,11,12,13};

[2010] This would have the effect

[2011] Mary.ReadWrite[0]=10 Mary.ReadWrite[1]=11

[2012] Mary.ReadWrite[2]=12 Mary.ReadWrite[3]=13

[2013] The other elements of Mary.ReadWrite may not be initialized. In this case, since Mary.Read is the same size as Mary.ReadWrite, elements 0-3 of Mary.Read would be initialized with the same values.

[2014] wom (Write-Only Memory)

[2015] One can declare a write-only memory using the keyword wom. The only use of a write-only memory is to declare an element within a multi-ported RAM. Since woms only exist inside multi-port RAMs, it is illegal to declare one outside an mpram declaration.

Syntax

wom variable_Type variable_Size WOM_Name [width] = initialise_Values [with {specs}]

Example

mpram connect
{
    wom <unsigned 8> Writeonly[256]; // Write only port
    rom <unsigned 8> Read[256]; // Read only port
};

[2016] Signal

[2017] FIG. 57A illustrates a method 5740 for using a dynamic object, i.e. signal, in a programming language. In general, in operation 5742, an object, i.e. signal, is defined with an associated first value and second value. The first value is then used in association with the object during a predetermined clock cycle. See operation 5744. The second value is used in association with the object before or after the predetermined clock cycle, as indicated in operation 5746.

[2018] In an aspect of the present invention, the object may be used to split up an expression into sub-expressions. As an option, the sub-expressions may be reused. In another aspect, the first value may be assigned to and read from the object during the predetermined clock cycle. More information regarding the above concept will now be set forth in greater detail.

[2019] A signal is an object that takes on the value assigned to it, but only for that clock cycle. The value assigned to it can be read back during the same clock cycle. At all other times it takes on its initialisation value. The default initialisation value is 0. The optional disambiguator < > can be used to clarify complex signal definitions.

Syntax

signal [<type data-width>] signal_Name;

Example

int 15 a, b;
signal <int> sig;
a = 7;
par
{
    sig = a;
    b = sig;
}

[2020] sig is assigned to and read from in the same clock cycle, so b is assigned the value of a. Since the signal only holds the value assigned to it for a single clock cycle, if it is read from just before or just after it is assigned to, one gets its initial value. For example:

int 15 a, b;
static signal <int> sig = 690;
a = 7;
par
{
    sig = a;
    b = sig;
}
a = sig;

[2021] Here, b is assigned the value of a through the signal, as before. Since there is a clock tick before the last line, a is finally assigned the signal's initial value of 690.
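The behaviour described above can be modelled in software. The following is a minimal sketch in conventional C, assuming a signal can be represented as an initialisation value plus a value that is valid only until the next clock tick; the type signal_model and its helper functions are hypothetical names, not part of Handel-C. In hardware the assignment and the read inside the par block are simultaneous, so this sequential model has to perform the assignment before the read within a cycle.

```c
#include <stdbool.h>

/* Hypothetical software model of a signal. */
struct signal_model {
    int init_value;   /* value seen in any cycle with no assignment */
    int cycle_value;  /* value assigned during the current cycle */
    bool assigned;    /* true if assigned during the current cycle */
};

static struct signal_model signal_make(int init_value)
{
    struct signal_model s = { init_value, 0, false };
    return s;
}

static void signal_assign(struct signal_model *s, int value)
{
    s->cycle_value = value;
    s->assigned = true;
}

static int signal_read(const struct signal_model *s)
{
    return s->assigned ? s->cycle_value : s->init_value;
}

/* A clock tick discards the value assigned during the finished cycle. */
static void signal_clock_tick(struct signal_model *s)
{
    s->assigned = false;
}

/* Replays the example: stores b through b_out and returns the final a. */
static int replay_example(int *b_out)
{
    struct signal_model sig = signal_make(690);
    int a = 7;

    /* One clock cycle: par { sig = a; b = sig; } */
    signal_assign(&sig, a);
    *b_out = signal_read(&sig);
    signal_clock_tick(&sig);

    /* Following clock cycle: a = sig; */
    a = signal_read(&sig);
    return a;
}
```

Calling replay_example leaves b equal to 7 and returns 690, matching the description of the example.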

[2022] Using Signals to Split up Complex Expressions

[2023] One can split up complex expressions. E.g., b=(((a*2)-55)<<2)+100; could also be written

int 17 a, b;
signal s1, s2, s3, s4;
par
{
    s1 = a;
    s2 = s1 * 2;
    s3 = s2 - 55;
    s4 = s3 << 2;
    b = s4 + 100;
}

[2024] Breaking up expressions also enables one to re-use sub-expressions:

unsigned 15 a, b;
signal sig1;
par
{
    sig1 = x + 2;
    a = sig1 * 3;
    b = sig1 / 2;
}

[2025] Type Qualifiers

[2026] Handel-C supports the type-qualifiers const and volatile to increase compatibility with ISO-C. These can be used to further qualify logic types.

[2027] Const

[2028] const defines a variable or an array of variables that cannot be assigned to. This means that they keep the initialisation value throughout. They may be initialized in the declaration statement. The const keyword can be used instead of #define to declare constant values. It can also be used to define function parameters which are never modified. The compiler may perform type-checking on const variables and prevent the programmer from modifying them.

[2029] Example

const int i = 5;
i = 10; // Error
i++; // Error

[2030] Volatile

[2031] In ISO-C, volatile is used to declare a variable that can be modified by something other than the program. It is mostly used for hard-wired registers. volatile controls optimization by forcing a re-read of the variable. It is only a guide, and may be ignored. The initial value of volatile variables is undefined. Handel-C does nothing with volatile. It is accepted for compatibility purposes.

[2032] Complex Declarations

[2033] It is possible to have extremely complex declarations in Handel-C. One can combine arrays of functions, structs, arrays, and pointers with architectural types. To clarify such expressions, it is wise to use typedef.

[2034] Macro Expressions in Widths

[2035] If one uses a macro expression to provide the width in a type declaration, one may enclose it in parentheses. This ensures that it may be correctly parsed as a macro.

[2036] int (mac(x)) y;

[2037] To declare a pointer to a function returning that type, one gets:

[2038] int (mac(x)) (*f)( );

[2039] < > (type clarifier)

[2040] < > is a Handel-C extension used to disambiguate complex declarations of architectural types. One cannot use it on logic types. It is good practice to use it whenever one declares channels, memories or signals, to clarify the format of data passed or stored in these variables.

[2041] Example

struct fishtank
{
    int 4 koi;
    int 8 carp;
    int 2 guppy;
} bowl;
signal <struct fishtank> drip;
chan <int 8 (*runwater)()> tap;

[2042] It is required to disambiguate a declaration such as:

[2043] chan int *x //pointer to channel or

[2044] //channel of pointers?

[2045] This should be declared as

[2046] chan <int *> x //channel of pointers

[2047] or

[2048] chan <int> *x //pointer to channel.

[2049] Storage Class Specifiers

[2050] Storage class specifiers define how variables are accessed. For compatibility with ISO-C, the specifiers auto and register can be used but have no effect. The scope of a variable is declared by the specifiers extern and static. The expansion of a function is defined by the specifier inline. The typedef specifier allows one to declare new names for existing types.

[2051] Auto

[2052] auto defines a local automatic variable. In Handel-C, all local variables default to auto. One cannot initialize an auto variable, but may assign it a value. The initialisation status of auto variables is undefined.

[2053] Example auto pig; pig = 15;

[2054] Extern

[2055] extern declares a variable that can be accessed by name from any function. Extern variables may be defined once outside all functions. (By default, any variable declared outside a function is assumed to be extern.)

[2056] If the variable is used in multiple source files, it is good practice to collect all the extern declarations in a header file, included at the top of each source file using the #include headerFileName directive. Note that one cannot access the same variable from different clock domains.

[2057] Example

extern int 16 global_fish;
int global_frog = 1234;
main()
{
    global_fish = global_frog;
    . . .
}

[2058] Syntax

[2059] extern variable declaration;

[2060] functionName(parameter-type-list)

[2061] Inline

[2062] inline causes a function to be expanded where it is called. The logic may be generated every time it is invoked. This ensures that the function is not accessed at the same time by parallel branches of code. By default, functions are assumed to be shared (not inline).

[2063] Example

inline int 4 knit(int needle, int stitch)
{
    needle = needle + stitch;
    return(needle);
}
int 4 jumper[100];
par(needle = 1; needle < 100; needle = needle+2)
{
    jumper[needle] = knit(needle, 1);
}

[2064] Syntax

[2065] inline function_Declaration;

[2066] Register

[2067] register has been implemented for reasons of compatibility with ISO-C. register defines a variable that has local scope. Its initial value is undefined.

[2068] Example

[2069] register int 16 fish;

[2070] fish = f(plop);

[2071] Static

[2072] static gives a variable static storage (its values are kept at all times). This ensures that the value of a variable is preserved across function calls. It also affects the scope of a variable or a function. Static functions and static variables declared outside functions can only be used in the file in which they appear. static variables declared within an inline function or an array of functions can only be used in the copy of the function in which they appear. static variables are the only local variables (excluding consts) that can be initialized.

[2073] Example

static int 16 local_function(int water, int weed);
static int 16 local_fish = 1234;
main( )
{
    int fresh, pondweed;
    local_fish = local_function(fresh, pondweed);
    . . .
}

[2074] Syntax

[2075] static variable declaration;

[2076] static functionName(parameter-type-list)

[2077] typedef

[2078] typedef defines another name for a variable type. This allows one to clarify the code. The new name is a synonym for the variable type.

[2079] typedef int 4 SMALL_FISH;

[2080] If the typedef is used in multiple source files, it is good practice to collect all the type definitions in a header file, included at the top of each source file using the #include headerFileName directive. It is conventional to differentiate typedef names from standard variable names, so that they are easily recognizable.

[2081] Example

typedef int 4 SMALL_FISH;
extern SMALL_FISH stickleback;

[2082] typeof

[2083] The typeof type operator allows the type of an object to be determined at compile time. The argument to typeof may be an expression. Using typeof ensures that related variables maintain their relationship. It makes it easy to modify code by simplifying the process of sorting out type and width conflicts. A typeof-construct can be used anywhere a type name could be used. For example, one can use it in a declaration, in casts or inside typeof.

[2084] Syntax

[2085] typeof (expression )

[2086] Example

unsigned 9 ch;
typeof(ch @ ch) q;
struct
{
    typeof(ch) cha, chb;
} s1;
typeof(s1) s2;
ch = s1.cha + s2.chb;
q = s1.chb @ s2.cha;

[2087] If the width of variable ch were changed in this example, there would be no need to modify any other code. This is also useful for passing parameters to macro procs. The code below shows how to use a typeof definition to deal with multiple parameter types.

macro proc swap (a, b)
{
    typeof(a) t;
    t = a;
    a = b;
    b = t;
}

[2088] Variable Initialization

[2089] Global variables (i.e. those declared outside all code blocks) may be initialized with their declaration. For example:

[2090] int 15 x = 1234;

[2091] Variables declared within functions can only be initialized if they have static storage or are consts. All other variables may not be initialized this way. Instead, one may use an explicit sequential or parallel list of assignments following the declarations to achieve the same effect. For example:

{
    int 4 x;
    unsigned 5 y;
    x = 5;
    y = 4;
}

[2092] Global and static variables may only be initialized with constants.

[2093] Statements

[2094] Introduction

[2095] As with conventional C, the execution flow of a Handel-C program is expressed as a series of statements such as assignment, conditional execution and iteration. Handel-C includes most of the statements from conventional C and these are detailed below.

[2096] Sequential and Parallel Execution

[2097] FIG. 57A-1 illustrates a method 5730 for using extensions to execute commands in parallel. In general, in operation 5732, a plurality of commands to be executed in parallel are designated.

[2098] This designation is replicated in operation 5734, and the commands are executed in parallel recursively. Note operation 5736. In one aspect, the commands may be executed in parallel recursively utilizing a FOR loop.

[2099] As an option, a first command may be executed simultaneously with a second command. Further, the first command may be executed simultaneously with the second command in a single clock cycle.

[2100] Handel-C implicitly executes instructions sequentially, but when targeting hardware it is extremely important to make as much use as possible of parallelism. For this reason, Handel-C also has a parallel composition keyword par to allow statements in a block to be executed in parallel.

[2101] The following example executes three assignments sequentially:

[2102] x = 1;

[2103] y = 2;

[2104] z = 3;

[2105] In contrast, the following example executes all three assignments in parallel and in the same clock cycle:

[2106] par

[2107] {

[2108] x = 1;

[2109] y = 2;

[2110] z = 3;

[2111] }

[2112] It should be noted that the second example executes all assignments literally in parallel. This is not the time-sliced pseudo-parallelism of a conventional microprocessor implementation but three specific pieces of hardware built to perform these three assignments. Detailed timing analysis may be dealt with later, but for now it is enough to state that the first example executes in 3 clock cycles while the second generates a similar quantity of hardware but executes in 1 clock cycle. Therefore, it is obvious that parallelism is a very important construct for targeting hardware. Within parallel blocks of code, sequential branches can be added by using a code block denoted with the {. . .} brackets instead of a single statement. For example:

[2113] par

[2114] {

[2115] x = 1;

[2116] {

[2117] y = 2;

[2118] z = 3;

[2119] }

[2120] }

[2121] In this example, the first branch of the parallel statement executes the assignment to x while the second branch sequentially executes the assignments to y and z. The assignments to x and y occur in the same clock cycle; the assignment to z occurs in the next clock cycle. The instruction following the par {. . .} may not be executed until all branches of the parallel block complete.
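The cycle counts quoted for the three examples follow from a simple rule: a sequential block takes the sum of its branch times, and a parallel block takes as long as its slowest branch. The following is a minimal sketch of that rule in conventional C, assuming each assignment takes one clock cycle as stated above; the function names are hypothetical.

```c
/* Clock cycles for a sequential block: branches run one after another. */
static unsigned seq_cycles(const unsigned *branch_cycles, unsigned count)
{
    unsigned total = 0;

    for (unsigned i = 0; i < count; i++)
        total += branch_cycles[i];
    return total;
}

/* Clock cycles for a par block: it completes when its slowest branch does. */
static unsigned par_cycles(const unsigned *branch_cycles, unsigned count)
{
    unsigned longest = 0;

    for (unsigned i = 0; i < count; i++)
        if (branch_cycles[i] > longest)
            longest = branch_cycles[i];
    return longest;
}

/* Models par { x = 1; { y = 2; z = 3; } } from the text. */
static unsigned mixed_example_cycles(void)
{
    const unsigned inner[2] = { 1, 1 };   /* y = 2; z = 3; */
    unsigned branches[2];

    branches[0] = 1;                      /* x = 1; */
    branches[1] = seq_cycles(inner, 2);   /* the nested sequential block */
    return par_cycles(branches, 2);
}
```

Under this model the purely sequential example takes 3 cycles, the purely parallel example takes 1 cycle, and the mixed example takes 2 cycles.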

[2122] Seq

[2123] To allow replication, the seq keyword exists. Sequential statements can be written with or without the keyword. The following example executes three assignments sequentially:

[2124] x = 1;

[2125] y = 2;

[2126] z = 3;

[2127] as does this:

[2128] seq

[2129] {

[2130] x = 1;

[2131] y = 2;

[2132] z = 3;

[2133] }

[2134] Replicated par and seq

[2135] One can replicate par and seq blocks by using a counted loop (a similar construct to a for loop). The count is defined with a start point (index_Base below), an end point (index_Limit) and a step size (index_Count). The body of the loop is replicated as many times as there are steps between the start and end points. If it is a par loop, the replicated processes may run in parallel; if a seq, they may run sequentially.

[2136] Syntax

par | seq (index_Base; index_Limit; index_Count)
{
    Body
}

index_Base, index_Limit and index_Count are macro exprs that are implicitly declared. They do not need to be single expressions; for example, one could declare par (i=0, j=23; i != 76; i++, j--).

[2137] Example

par(i=0; i<3; i++)
{
    a[i] = b[i];
}

expands to:

par
{
    a[0] = b[0];
    a[1] = b[1];
    a[2] = b[2];
}

Replicated pipeline

unsigned init;
unsigned q[149];
unsigned 31 out;
init = 57;
par(r = 0; r < 16; r++)
{
    ifselect(r == 0)
        q[r] = init;
    else ifselect(r == 15)
        out = q[r-1];
    else
        q[r] = q[r-1];
}

[2138] ifselect checks for the start of the pipeline, the replicator rules create the middle sections and ifselect checks the end. This code expands to:

par
{
    q[0] = init;
    q[1] = q[0];
    q[2] = q[1];
    etc . . .
    q[14] = q[13];
    out = q[14];
}
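Because every assignment in the expanded par block happens in the same clock cycle, each stage reads the value its neighbour held before the tick. The following is a minimal sketch of that behaviour in conventional C, assuming the expanded block is executed once per clock cycle and that all stages start at zero; the type pipeline and the functions pipeline_tick and ticks_until_output are hypothetical names used only for illustration.

```c
#include <string.h>

#define STAGES 15  /* q[0] through q[14], as in the expanded code */

struct pipeline {
    unsigned q[STAGES];
    unsigned out;
};

/* One clock cycle: every assignment reads the values held before the tick. */
static void pipeline_tick(struct pipeline *p, unsigned init)
{
    struct pipeline next;

    next.q[0] = init;
    for (int r = 1; r < STAGES; r++)
        next.q[r] = p->q[r - 1];
    next.out = p->q[STAGES - 1];
    *p = next;
}

/* Counts the ticks until a value fed in at q[0] first appears on out.
   The cap of 100 ticks only guards against a runaway loop. */
static unsigned ticks_until_output(unsigned value)
{
    struct pipeline p;
    unsigned ticks = 0;

    memset(&p, 0, sizeof p);
    do {
        pipeline_tick(&p, value);
        ticks++;
    } while (p.out != value && ticks < 100);
    return ticks;
}
```

Computing every new value from the old state before committing it is what distinguishes this from an ordinary sequential loop, in which a single pass would copy init straight through to out.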

[2139] Assert

[2140] assert allows one to generate messages at compile-time if a condition is not met. Assertions can be used to check compile-time constants and help guard against possible problematic code alterations. The user supplies an expression that checks the value of a compile-time constant, and if the expression evaluates to false, an error message is sent to the standard error channel in the format filename:(line number):(column number):: Assertion failed: user-defined error string

[2141] Syntax

[2142] assert(condition, [string with format specification(s), {argument(s)}]);

If condition is false, string may be sent to the standard error channel, with each format specification replaced by an argument. When assert encounters the first format specification (if any), it converts the value of the first argument into that format and outputs it. The second argument is formatted according to the second format specification and so on. If there are more expressions than format specifications, the extra expressions are ignored. The results are undefined if there are not enough arguments for all the format specifications.

[2143] The format specification is one of:

[2144] %c Display as a character
%s Display as a string

[2145] %d Display as a decimal
%f Display as a floating point

[2146] %o Display as an octal
%x Display as a hexadecimal

[2147] Example:

int f(int x)
{
    assert(width(x)==3, "Width of x is not 3 (it is %d)", width(x));
    return x+1;
}
void main(void)
{
    int 4 y;
    y = f(y);
}

[2148] x may be inferred to have a width of 4, so the following message may be displayed. F:\proj\test.c(4)(2): Assertion failed : Width of x is not 3 (it is 4).

[2149] Continue

[2150] continue moves straight to the next iteration of a for, while or do loop. For do or while, this means that the test is executed immediately. In a for statement, the increment step is executed. This allows one to avoid deeply nested if . . . else statements within loops.

[2151] Example

for(i = 100; i > 0; i--)
{
    x = f( i );
    if( x == 1 )
        continue;
    y += x * x;
}

[2152] Goto

[2153] goto label moves straight to the statement specified by label. label has the same format as a variable name, and may be in the same function as the goto. Labels have function scope. Formally, goto is never necessary. It may be useful for extracting from deeply nested levels of code in case of error.

[2154] Example

for( . . . )
{
    for( . . . )
    {
        if(disaster)
            goto Error;
    }
}
Error:
output ! error_code;

[2155] Return [Expression]

[2156] The return statement is used to return from a function to its caller. return terminates the function and returns control to the calling function. Execution resumes at the line immediately following the function call. return can return a value to the calling function. The value returned is of the type declared in the function declaration. Functions that do not return a value should be declared to be of type void.

[2157] Example

int power(int base, int n)
{
    int i, p;
    p = 1;
    for(i = 1; i <= n; ++i)
        p = p * base;
    return(p);
}

[2158] Assignments

[2159] Handel-C assignments are of the form:

[2160] Variable = Expression;

[2161] For example:

[2162] x = 3;

[2163] y = a + b;

[2164] The expression on the right hand side may be of the same width and type (signed or unsigned) as the variable on the left hand side. The compiler generates an error if this is not the case. The left hand side of the assignment may be any variable, array element or RAM element. The right hand side of the assignment may be any expression described later. Handel-C also provides a number of short cut assignment statements. Note that these cannot be used in expressions as they can in conventional C but only in stand-alone statements. These short cuts are:

Statement                      Expansion
Variable ++;                   Variable = Variable + 1;
Variable --;                   Variable = Variable - 1;
++ Variable;                   Variable = Variable + 1;
-- Variable;                   Variable = Variable - 1;
Variable += Expression;        Variable = Variable + Expression;
Variable -= Expression;        Variable = Variable - Expression;
Variable *= Expression;        Variable = Variable * Expression;
Variable /= Expression;        Variable = Variable / Expression;
Variable %= Expression;        Variable = Variable % Expression;
Variable <<= Expression;       Variable = Variable << Expression;
Variable >>= Expression;       Variable = Variable >> Expression;
Variable &= Expression;        Variable = Variable & Expression;
Variable |= Expression;        Variable = Variable | Expression;
Variable ^= Expression;        Variable = Variable ^ Expression;

[2165] Channel Communication

[2166] Channels are a way of communicating between processes. When one writes to a channel, a copy of the data written is sent to the receiving process. This allows information to be shared between processes. Since a variable cannot be written to by multiple processes, one can write to the variable in a single process by reading channels that send data from other processes. Each channel may be written to at one end, and read from at the other. The width and type of data sent down the channel may be the same as the width and type of the channel. The channel can be an entry in an array of channels, or be pointed to by a channel pointer.

[2167] As with other variables, if no width or type is given to a channel (or if it is set as undefined), the compiler can infer the channel width and type from its use. Reading from a channel is done as follows:

[2168] Channel ? Variable;

[2169] This assigns the value read from the channel to the variable. The variable may also be a signal, an array element, RAM element or WOM element.

[2170] Writing to a channel is as follows:

[2171] Channel ! Expression;

[2172] This writes the value of the expression to the channel. Expression may be any expression described later. No two statements may simultaneously write to or simultaneously read from a single channel.

par
{
    out ! 3; // Parallel write to a channel
    out ! 4;
}

[2173] This code is illegal as it attempts to write simultaneously to a single channel. Similarly, the following code is illegal because an attempt is made to read simultaneously from the same channel:

par
{
    in ? x; // Parallel read from a channel
    in ? y;
}

Example

set clock = external;
void main(void)
{
    signal Fred;
    unsigned 8 Res;
    chan Bill;
    par
    {
        Bill ! 23;
        Bill ? Fred;
        Res = Fred;
    }
}

[2174] prialt

[2175] The prialt statement selects the first channel ready to communicate. The syntax is similar to a conventional C switch statement.

prialt
{
    case CommsStatement:
        Statement
        break;
    ......
    case CommsStatement:
        Statement
        break;
    ......
    default:
        Statement
        break;
}

[2176] prialt selects between the communications on several channels depending on the readiness of the other end of the channel.

[2177] CommsStatement may be one of the following:

[2178] Channel ? Variable

[2179] Channel ! Expression

[2180] The case whose communication statement is the first to be ready to transfer data may execute and data may be transferred over the channel. The statements up to the next break statement may then be executed. The prialt construct does not allow the same channel to be listed twice in its cases, and fall-through of cases is prohibited. This means that each case may have its own break statement. If two channels are ready simultaneously, then the first one listed in the code takes priority.

[2181] Default

[2182] prialt with no default case:

[2183] execution halts until one of the channels becomes ready to communicate.

[2184] prialt statement with default case:

[2185] if none of the channels is ready to communicate immediately then the default branch statements execute and the prialt statement terminates.
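The selection rule described for prialt can be summarised as a priority scan over the listed cases. The following is a minimal sketch of that rule in conventional C, assuming the readiness of each channel is already known as a flag in listing order; the function prialt_select and the two marker values are hypothetical and do not correspond to any Handel-C construct.

```c
#include <stdbool.h>

#define PRIALT_DEFAULT (-1)  /* hypothetical marker: default branch is taken */
#define PRIALT_BLOCKED (-2)  /* hypothetical marker: execution must wait */

/* ready[i] is true when the channel in case i can communicate now.
   Cases are listed in priority order, so the lowest ready index wins. */
static int prialt_select(const bool *ready, int case_count, bool has_default)
{
    for (int i = 0; i < case_count; i++)
        if (ready[i])
            return i;
    return has_default ? PRIALT_DEFAULT : PRIALT_BLOCKED;
}
```

When two channels are ready at once, the scan returns the one listed first; when none is ready, the result depends only on whether a default case was provided.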

[2186] Conditional execution (if . . . else)

[2187] Handel-C provides the standard C conditional execution construct as follows:

[2188] if(Expression)

[2189] Statement

[2190] else

[2191] Statement

[2192] As in conventional C, the else portion may be omitted if not required. For example:

[2193] if(x == 1)

[2194] x = x+1;

[2195] Here, and throughout the rest of the present description, Statement may be replaced with a block of statements by enclosing the block in {. . . } brackets. For example:

[2196] if (x>y)

[2197] {

[2198] a = b;

[2199] c = d;

[2200] }

[2201] else

[2202] {

[2203] a = d;

[2204] c = b;

[2205] }

[2206] The first branch of the conditional is executed if the expression is true and the second branch is executed if the expression is false. Handel-C treats zero values as false and non-zero values as true. As may be seen later, the relational logical operators return values to match this meaning, but it is also possible to use variables as conditions. For example:

[2207] if (x)

[2208] a = b;

[2209] else

[2210] c = d;

[2211] This is expanded by the compiler to:

[2212] if (x!=0)

[2213] a = b;

[2214] else

[2215] c = d;

[2216] When executed, if x is not equal to 0 then b is assigned to a. If x is 0 then d is assigned to c.

[2217] While Loops

[2218] Handel-C provides while loops exactly as in conventional C:

[2219] while (Expression)

[2220] Statement

[2221] The contents of the while loop may be executed zero or more times depending on the value of Expression. While Expression is true, Statement is executed repeatedly. Again, Statement may be replaced with a block of statements. For example:

[2222] x = 0;

[2223] while (x !=45)

[2224] {

[2225] y = y + 5;

[2226] x = x + 1;

[2227] }

[2228] This code adds 5 to y 45 times (equivalent to adding 225 to y).

[2229] do . . . while loops

[2230] Handel-C provides do . . . while loops exactly as in conventional C:

[2231] do

[2232] Statement

[2233] while (Expression);

[2234] The contents of the do . . . while loop are executed at least once because the conditional expression is evaluated at the end of the loop rather than at the beginning as is the case with while loops. Again, Statement may be replaced with a block of statements. For example:

[2235] do

[2236] {

[2237] a = a + b;

[2238] x = x - 1;

[2239] } while (x>y);

[2240] For Loops

[2241] Handel-C provides for loops similar to those in conventional C.

[2242] for (Initialisation; Test; Iteration)

[2243] Statement

[2244] The body of the for loop may be executed zero or more times according to the results of the condition test. There is a direct correspondence between for loops and while loops.

[2245] for (Init; Test; Inc)

[2246] Body;

[2247] is directly equivalent to:

[2248] {

[2249] Init;

[2250] while (Test)

[2251] {

[2252] Body;

[2253] Inc;

[2254] }

[2255] }

[2256] unless the Body includes a continue statement. In a for loop, continue jumps to just before the increment, so Inc is still executed; in the while form, continue jumps past the increment, so Inc is skipped. Each of the initialisation, test and iteration statements is optional and may be omitted if not required. As with all other Handel-C constructs, Statement may be replaced with a block of statements. For example:

[2257] for (; x>y; x++ )

[2258] {

[2259] a = b;

[2260] c = d;

[2261] }
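The difference that continue makes between a for loop and its while rewrite can be observed directly in conventional C, which the text states behaves the same way in this respect. The following is a minimal sketch; the function names are hypothetical, and the max_steps guard exists only so that the non-terminating while form can be detected rather than run forever.

```c
/* for loop: continue still runs the increment, so the loop terminates.
   Returns how many odd values of i were seen below limit. */
static int count_odd_with_for(int limit)
{
    int odd = 0;

    for (int i = 0; i < limit; i++) {
        if (i % 2 == 0)
            continue;      /* jumps to i++ */
        odd++;
    }
    return odd;
}

/* Naive while rewrite of the same loop: continue skips the increment.
   Returns 1 if the loop ended by itself, 0 if the guard had to stop it. */
static int naive_while_terminates(int limit, int max_steps)
{
    int i = 0;
    int steps = 0;

    while (i < limit) {
        if (++steps > max_steps)
            return 0;      /* guard tripped: i never advanced */
        if (i % 2 == 0)
            continue;      /* jumps to the test; i++ below is skipped */
        i++;
    }
    return 1;
}
```

With a limit of 10, the for version counts five odd values, while the while version stalls at i equal to 0 until the guard stops it.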

[2262] The difference between a conventional C for loop and the Handel-C version is in the initialisation and iteration phases. In conventional C, these two fields contain expressions, and by using expression side effects (such as ++ and --) and the sequential operator ',' conventional C allows complex operations to be performed. Since Handel-C does not allow side effects in expressions, the initialisation and iteration expressions have been replaced with statements. For example:

[2263] for (x = 0; x < 20; x = x+1)

[2264] {

[2265] y = y+ 2;

[2266] }

[2267] Here, the assignment of 0 to x and adding one to x are both statements and not expressions. These initialisation and iteration statements can be replaced with blocks of statements by enclosing the block in {. . . } brackets. For example:

[2268] for ({ x=0; y=23;} ; x < 20; {x+=1; x*=2;})

[2269] {

[2270] y = y + 2;

[2271] }

[2272] Switch

[2273] Handel-C provides switch statements similar to those in conventional C.

switch (Expression)
{
    case Constant:
        Statement
        break;
    ......
    default:
        Statement
        break;
}

[2274] The switch expression is evaluated and checked against each of the case compile time constants. The statement(s) guarded by the matching constant are executed until a break statement is encountered. If no matches are found, the default statement is executed. If no default option is provided, no statements are executed.

[2275] Each of the Statement lines above may be replaced with a block of statements by enclosing the block in {. . .} brackets. As with conventional C, it is possible to make execution drop through case branches by omitting a break statement. For example:

[2276] switch (x)

[2277] {

[2278] case 10:

[2279] a = b;

[2280] case 11:

[2281] c = d;

[2282] break;

[2283] case 12:

[2284] e = f;

[2285] break;

[2286] }

[2287] Here, if x is 10, b is assigned to a and d is assigned to c; if x is 11, d is assigned to c; and if x is 12, f is assigned to e.

[2288] The values following each case branch may be compile time constants.
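The drop-through behaviour can be checked in conventional C, which the text says Handel-C follows here. The following is a minimal sketch; the struct regs and the function run_switch are hypothetical names, with the six variables of the example gathered into one structure so that the effect of each case can be inspected.

```c
/* The six variables used by the switch example, gathered for inspection. */
struct regs {
    int a, b, c, d, e, f;
};

/* Same structure as the example: case 10 has no break and drops into 11. */
static void run_switch(struct regs *r, int x)
{
    switch (x) {
    case 10:
        r->a = r->b;
        /* no break: execution drops through to case 11 */
    case 11:
        r->c = r->d;
        break;
    case 12:
        r->e = r->f;
        break;
    }
}
```

Starting from a structure in which a, c and e are zero, a call with x equal to 10 updates both a and c, a call with 11 updates only c, and a call with 12 updates only e. Any other value leaves the structure unchanged, since no default branch is present.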

[2289] Break

[2290] Handel-C provides the normal C break statement both for terminating loops and for separation of case branches in switch and prialt statements.

[2291] When used within a while, do . . . while or for loop, the loop is terminated and execution continues from the statement following the loop. For example:

[2292] for (x=0; x<32; x++)

[2293] {

[2294] if (a[x]==0)

[2295] break;

[2296] b[x]=a[x];

[2297] }

[2298] // Execution continues here

[2299] When used within a switch statement, execution of the case branch terminates and the statement following the switch is executed. For example:

switch (x)
{
    case 1:
    case 2:
        y++;
        break;
    case 3:
        z++;
        break;
}
// Execution continues here

[2300] When used within a prialt statement, execution of the case branch terminates and the statement following the prialt is executed. For example:

prialt
{
    case a ? x:
        x++;
        break;
    case b ! y:
        y++;
        break;
}
// Execution continues here

[2301] Delay

[2302] Handel-C provides a delay statement, not found in conventional C, which does nothing but takes one clock cycle to do it. This may be useful to avoid resource conflicts (for example to prevent two accesses to one RAM in a single clock cycle) or to adjust execution timing. Delay can also be used to break combinatorial logic cycles.

[2303] Address and Indirection

[2304] The address operator (&) is used to access the address of a variable. The indirection operator * is the same as it is in ISO-C. It is used to declare pointers to objects, and to de-reference pointers (i.e. to access objects pointed to by pointers).

[2305] Member Operators

[2306] The structure member operator (.) is used to access members of a structure or mpram, or to access a port within an interface. The structure pointer operator (->) can be used, as in ISO-C. It is used to access the members of a structure or mpram, when the structure/mpram is referenced through a pointer.

mpram Fred
{
    ram <unsigned 8> ReadWrite[256]; // Read/write port
    rom <unsigned 8> Read[256]; // Read only port
} Joan;
mpram Fred *mpramPtr;
mpramPtr = &Joan;
x = mpramPtr->Read[56];

[2307] If a memory is made up of structures, the structure member operator can be used to reference structure members within the memory.

[2308] ram struct S compRAM[100];

[2309] ram struct S (*ramStructPtr)[ ];

[2310] ramStructPtr = &compRAM;

[2311] x = (*ramStructPtr)[10].a;

[2312] Expressions

[2313] Introduction

[2314] Expressions in Handel-C take no clock cycles to be evaluated, and so have no bearing on the number of clock cycles a given program takes to execute. They do affect the maximum possible clock rate for a program: the more complex an expression, the more hardware is involved in its evaluation and the longer it is likely to take because of combinatorial delays in the hardware. The clock period for the entire hardware program is limited by the longest such evaluation in the whole program. More details on timing and efficiency considerations will be set forth hereinafter in greater detail. Because expressions are not allowed to take any clock cycles, expressions with side effects are not permitted in Handel-C. For example:

[2315] a = b++; /* NOT PERMITTED */

[2316] This is not permitted because the ++ operator has the side effect of assigning b+1 to b, which requires one clock cycle. Note that even the longest and most complex C expression with many side effects can be written in terms of a larger number of simpler expressions. The resulting code is normally easier to read. For example:

[2317] a = (b++) + (((c-- ? d++ : e--)), f);

[2318] can be rewritten as:

[2319] a = b + f;

[2320] b = b + 1;

[2321] if(c)

[2322] d = d + 1;

[2323] else

[2324] e = e - 1;

[2325] c = c - 1;

[2326] Note that Handel-C provides the prefix and postfix ++ and -- operations as statements rather than expressions. For example:

[2327] a++;

[2328] b--;

[2329] ++c;

[2330] --d;

[2331] This example is directly equivalent to:

[2332] a = a + 1;

[2333] b = b - 1;

[2334] c = c + 1;

[2335] d = d - 1;

[2336] Restrictions on RAMs and ROMs

[2337] Because of their architecture, RAMs and ROMs are restricted to performing operations sequentially. Only one element of a RAM or ROM may be addressed in any given clock cycle and, as a result, familiar looking statements are often disallowed. For example:

[2338] ram unsigned int 8 x[4];

[2339] x[1] = x[3] + 1;

[2340] This code is illegal because the assignment attempts to read from element 3 of x in the same cycle as it writes to element 1. Note that the ports within a multi-port RAM refer to the same elements of memory, so only a single access may be made to any one mpram port in a single clock cycle. The following code is also disallowed:

[2341] ram unsigned int 8 x[4];

[2342] if (x[0]==0)

[2343] x[1] = 1;

[2344] This is because the condition evaluation may read from element 0 of the RAM in the same clock cycle as the assignment writes to element 1. Similar restrictions apply to while loops, do . . . while loops, for loops and switch statements. Note that arrays of variables do not have these restrictions but may require substantially more hardware to implement than RAMs, depending on the target architecture.

[2345] Operators

[2346] Bit Manipulation Operators

[2347] The following bit manipulation operators are provided in Handel-C:

Operator    Meaning
<<          Shift left
>>          Shift right
<-          Take least significant bits
\\          Drop least significant bits
@           Concatenate bits
[ ]         Bit selection

[2348] width(Expression) Width of expression

[2349] Shift Operators

[2350] The shift operators shift a value left or right by a variable number of bits, resulting in a value of the same width as the value being shifted. Any bits shifted outside this width are lost. When shifting unsigned values, the right shift pads the upper bits with zeros. When right shifting signed values, the upper bits are copies of the top bit of the original value. Thus, a shift right by 1 divides the value by 2 and preserves the sign. For example:

[2351] unsigned int 8 x;

[2352] int 8 y;

[2353] x = 192;

[2354] y = -8;

[2355] x = x >> 1;

[2356] y = y >> 1;

[2357] This results in x being set to 96 and y being set to -4.
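By way of illustration only, the right shift behaviour described above can be modelled in Python (not Handel-C). The helper name shift_right and its parameters are hypothetical; the sketch assumes two's-complement storage and a shift of exactly one bit.

```python
def shift_right(value, width, signed):
    """Model a one-bit right shift on a fixed-width value."""
    mask = (1 << width) - 1
    bits = value & mask                  # two's-complement bit pattern
    top = (bits >> (width - 1)) & 1      # top bit of the original value
    bits >>= 1                           # upper bit is now zero
    if signed and top:
        bits |= 1 << (width - 1)         # signed: copy the original top bit back in
    if signed and (bits >> (width - 1)):
        return bits - (1 << width)       # reinterpret the pattern as a negative number
    return bits


x = shift_right(192, 8, signed=False)    # unsigned int 8 x = 192
y = shift_right(-8, 8, signed=True)      # int 8 y = -8
print(x, y)
```

The two calls mirror the declarations in the example above, one unsigned and one signed.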

[2358] Take Operator

[2359] The take operator, <-, returns the n least significant bits of a value. The drop operator, \\, returns all but the n least significant bits of a value. n may be a compile-time constant. For example:

[2360] macro expr four = 8 / 2;

[2361] unsigned int 8 x;

[2362] unsigned int 4 y;

[2363] unsigned int 4 z;

[2364] x = 0xC7;

[2365] y = x <- four;

[2366] z = x \\ 4;

[2367] This results in y being set to 7 and z being set to 12 (or 0xC in hexadecimal).
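The take and drop operators can be sketched in Python as a mask and a shift. The function names take and drop are hypothetical stand-ins for the <- and \\ operators, and the sketch assumes unsigned values.

```python
def take(value, n):
    """Model of 'value <- n': keep the n least significant bits."""
    return value & ((1 << n) - 1)


def drop(value, n):
    """Model of 'value \\\\ n': discard the n least significant bits."""
    return value >> n


x = 0xC7
y = take(x, 4)   # low nibble
z = drop(x, 4)   # high nibble
print(hex(y), hex(z))
```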

[2368] Concatenation Operator

[2369] The concatenation operator, @, joins two sets of bits together into a result whose width is the sum of the widths of the two operands. For example:

[2370] unsigned int 8 x;

[2371] unsigned int 4 y;

[2372] unsigned int 4 z;

[2373] y = 0xC;

[2374] z = 0x7;

[2375] x = y @ z;

[2376] This results in x being set to 0xC7. The left operand of the concatenation operator forms the most significant bits of the result.
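A minimal Python sketch of concatenation follows. Because a Python integer carries no width, the sketch assumes each operand is passed as a (value, width) pair; the function name concat is hypothetical.

```python
def concat(left, right):
    """Model of 'left @ right' on (value, width) pairs."""
    left_value, left_width = left
    right_value, right_width = right
    # The left operand forms the most significant bits of the result.
    value = (left_value << right_width) | right_value
    return value, left_width + right_width


y = (0xC, 4)
z = (0x7, 4)
x = concat(y, z)
print(hex(x[0]), x[1])
```

The returned width is the sum of the operand widths, as stated above.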

[2377] Bit Selection

[2378] Individual bits or a range of bits may be selected from a value by using the [ ] operator. Bit 0 is the least significant bit and bit n-1 is the most significant bit, where n is the width of the value.

[2379] For example:

[2380] unsigned int 8 x;

[2381] unsigned int 1 y;

[2382] unsigned int 5 z;

[2383] x = 0b01001001;

[2384] y = x[4];

[2385] z = x[7:3];

[2386] This results in y being set to 0 and z being set to 9. Note that the range of bits is of the form MSB:LSB and is inclusive. Thus, the range 7:3 is 5 bits wide.
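The inclusive MSB:LSB range can be checked with a short Python model. The helper select_bits is a hypothetical name; a single-bit selection such as x[4] is expressed as the range 4:4.

```python
def select_bits(value, msb, lsb):
    """Model of 'value[msb:lsb]' with an inclusive range."""
    width = msb - lsb + 1
    return (value >> lsb) & ((1 << width) - 1)


x = 0b01001001
y = select_bits(x, 4, 4)   # x[4]
z = select_bits(x, 7, 3)   # x[7:3], five bits wide
print(y, z)
```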

[2387] Bit selection in RAM, ROM and array elements is also possible. For example:

[2388] ram int 7 w[23];

[2389] int 5 x[4];

[2390] int 3 y;

[2391] unsigned int 1 z;

[2392] y = w[10][4:2];

[2393] z = x[2][0];

[2394] Here, the 10 is the entry in the RAM and the 4:2 selects three bits from the middle of the value in the RAM. Similarly, z is set to the least significant bit in the x[2] variable.

[2395] Width Operator

[2396] The width( ) operator returns the width of an expression. It is a compile time constant. For example:

[2397] x = y <- width(x);

[2398] This takes the least significant bits of y and assigns them to x. The width( ) operator ensures that the correct number of bits is taken from y to match the width of x.

[2399] Arithmetic Operators

[2400] The following arithmetic operators are provided in Handel-C:

Operator    Meaning
+           Addition
-           Subtraction
*           Multiplication
/           Division
%           Modulus arithmetic

[2401] Any attempt to perform one of these operations on two expressions of differing widths or types results in a compiler error. For example:

[2402] int 4 w;

[2403] int 3 x;

[2404] int 4 y;

[2405] unsigned 4 z;

[2406] y = w + x; // ILLEGAL

[2407] z = w + y; // ILLEGAL

[2408] The first statement is illegal because w and x have different widths. The second statement is illegal because w and y are signed integers and z is an unsigned integer. All operators return results of the same width as their operands. Thus, all overflow bits are lost. For example:

[2409] unsigned int 8 x;

[2410] unsigned int 8 y;

[2411] unsigned int 8 z;

[2412] x = 128;

[2413] y = 192;

[2414] z = 2;

[2415] x = x + y;

[2416] z = z * y;

[2417] This example results in x being set to 64 and z being set to 128. By using the bit manipulation operators to expand the operands, it is possible to obtain extra information from the arithmetic operations. For instance, the carry bit of an addition or the overflow bits of a multiplication may be obtained by first expanding the operands to the maximum width required to contain this extra information. For example:

[2418] unsigned int 8 u;

[2419] unsigned int 8 v;

[2420] unsigned int 9 w;

[2421] unsigned int 8 x;

[2422] unsigned int 8 y;

[2423] unsigned int 16 z;

[2424] w = (0 @ u) + (0 @ v);

[2425] z = (0 @ x) * (0 @ y);

[2426] In this example, w and z contain all the information obtainable from the addition and multiplication operations. Note that the constant zeros do not require a width specification because the compiler can infer their widths from the usage. The zeros in the first assignment may be 1 bit wide because the destination is 9 bits wide while the source operands are only 8 bits wide. In the second assignment, the zero constants may be 8 bits wide because the destination is 16 bits wide while the source operands are only 8 bits wide.
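The effect of widening the operands before the operation can be sketched in Python. The helpers add_fixed and mul_fixed are hypothetical; the operand values 128, 192 and 2 are borrowed from the earlier overflow example as an assumption, since no values are given for u, v, x and y here.

```python
def add_fixed(a, b, width):
    """Addition whose result is truncated to 'width' bits."""
    return (a + b) & ((1 << width) - 1)


def mul_fixed(a, b, width):
    """Multiplication whose result is truncated to 'width' bits."""
    return (a * b) & ((1 << width) - 1)


narrow_sum = add_fixed(128, 192, 8)        # overflow bit lost
wide_sum = add_fixed(128, 192, 9)          # operands widened by one bit
carry = wide_sum >> 8                      # the ninth bit is the carry

narrow_product = mul_fixed(2, 192, 8)      # overflow bits lost
wide_product = mul_fixed(2, 192, 16)       # operands widened to 16 bits
print(narrow_sum, wide_sum, carry, narrow_product, wide_product)
```

With a 9-bit result the carry is kept as the top bit, and with a 16-bit result the full product is kept.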

[2427] Operator Precedence

[2428] Precedence of operators is as expected from conventional C. For example:

[2429] x = x + y * z;

[2430] This performs the multiplication before the addition. Brackets may be used to ensure the correct calculation order as in conventional C.

[2431] Relational Operators

[2432] The following relational operators are provided in Handel-C:

Operator    Meaning
==          Equal to
!=          Not equal to
<           Less than
>           Greater than
<=          Less than or equal
>=          Greater than or equal

[2433] These operators compare values of the same width and return a single bit wide unsigned int value of 0 for false or 1 for true. This means that the following conventional C code is invalid:

[2434] int 8 w, x, y, z;

[2435] w = x + (y>z); // NOT ALLOWED

[2436] Instead, one should write:

[2437] w = x + (0@(y>z));

[2438] Signed/Unsigned Compares

[2439] Signed/signed compares and unsigned/unsigned compares are handled automatically. Mixed signed and unsigned compares are not handled automatically. For example:

[2440] unsigned 8 x;

[2441] int 8 y;

[2442] if (x>y) // Not allowed

[2443] . . .

[2444] To compare signed and unsigned values, one may extend each of the parameters by one bit: the unsigned value is zero extended and the signed value is sign extended. The above code can be rewritten as:

[2445] unsigned 8 x;

[2446] int 8 y;

[2447] if ((int)(0@x) > (y[7]@y))

[2448] . . .
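The reason for widening both operands before a mixed compare can be illustrated with a Python model of the rewritten condition. The function names to_signed and compare_mixed are hypothetical, and the sample values 200 and -1 are chosen here only for illustration.

```python
def to_signed(bits, width):
    """Interpret a 'width'-bit pattern as a two's-complement number."""
    bits &= (1 << width) - 1
    return bits - (1 << width) if bits >> (width - 1) else bits


def compare_mixed(x_bits, y_bits):
    """Model of '(int)(0 @ x) > (y[7] @ y)' for 8-bit x (unsigned) and y (signed)."""
    x9 = x_bits & 0xFF                                   # 0 @ x: zero extended to 9 bits
    y9 = (((y_bits >> 7) & 1) << 8) | (y_bits & 0xFF)    # y[7] @ y: sign extended to 9 bits
    return to_signed(x9, 9) > to_signed(y9, 9)


x = 200                 # unsigned 8
y = -1 & 0xFF           # int 8 holding -1, stored as the pattern 0xFF
widened = compare_mixed(x, y)
naive = to_signed(x, 8) > to_signed(y, 8)   # treats x's bits as signed: wrong
print(widened, naive)
```

In the naive compare the pattern for 200 is read as -56, so the comparison with -1 gives the wrong answer; the widened compare preserves both values.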

[2449] Implicit Compares

[2450] The Handel-C compiler inserts implicit compares with zero if a value is used as a condition on its own. For example, the first of the following two loops is expanded to the second:

[2451] while (1)

[2452] {

[2453] . . .

[2454] }

[2455] while (1 != 0)

[2456] {

[2457] . . .

[2458] }

[2459] Logical Operators

[2460] The following logical operators are provided in Handel-C:

Operator    Meaning
&&          Logical and
||          Logical or
!           Logical not

[2461] These operators are provided to combine conditions as in conventional C. Each operator takes 1-bit unsigned operands and returns a 1-bit unsigned result. Note that the operands of these operators need not be the results of relational operators. For example:

[2462] if (x || y > z)

[2463] w = 0;

[2464] In this example, the variable x need not be 1 bit wide; if it is wider, the Handel-C compiler inserts a compare with 0. As in conventional C, the condition of the if statement is true if x is not equal to 0 or y is greater than z. This feature allows some familiar looking conventional C constructs. For example:

[2465] while (x || y)

[2466] {

[2467] . . .

[2468] }

[2469] Bitwise Logical Operators

[2470] The following bitwise logical operators are provided in Handel-C:

Operator    Meaning
&           Bitwise and
|           Bitwise or
^           Bitwise exclusive or
~           Bitwise not

[2471] These operators perform bitwise logical operations on values. Both operands may be of the same type and width: the resulting value may also be this type and width. For example:

[2472] unsigned int 6 w;

[2473] unsigned int 6 x;

[2474] unsigned int 6 y;

[2475] unsigned int 6 z;

[2476] w = 0b101010;

[2477] x = 0b011100;

[2478] y = w & x;

[2479] z = w | x;

[2480] w = w ^ ~x;

[2481] This example results in y having the value 0b001000, z having the value 0b111110 and w having the value 0b001001.
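The three results can be reproduced with a short Python model. Python integers are unbounded, so the sketch assumes a 6-bit mask (MASK, a hypothetical name) must be applied after the bitwise not to stay within the declared width.

```python
WIDTH = 6
MASK = (1 << WIDTH) - 1     # 0b111111

w = 0b101010
x = 0b011100

y = w & x                   # bitwise and
z = w | x                   # bitwise or
w = w ^ (~x & MASK)         # exclusive or with the 6-bit complement of x
print(format(y, "06b"), format(z, "06b"), format(w, "06b"))
```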

[2482] Conditional Operator

[2483] Handel-C provides the conditional expression construct familiar from conventional C. Its format is:

Expression ? Expression : Expression

[2484] The first expression is evaluated and, if true, the whole expression evaluates to the result of the second expression. If the first expression is false, the whole expression evaluates to the result of the third expression. For example:

[2485] x = (y > z) ? y : z;

[2486] This sets x to the maximum of y and z. This code is directly equivalent to:

[2487] if (y > z)

[2488] x = y;

[2489] else

[2490] x = z;

[2491] The advantage of using this construct is that the result is an expression so it can be embedded in a more complex expression. For example:

[2492] x = ((y > z) ? y : z) + 4;

[2493] Casting of Expression Types

[2494] The following piece of Handel-C is invalid:

[2495] int 4 x; // Range of x: -8 . . . 7

[2496] unsigned int 4 y; // Range of y: 0 . . . 15

[2497] x = y; // Not allowed

[2498] This is because x is a signed integer while y is an unsigned integer. When generating hardware, it is not clear what the compiler should do here. It could simply assign the 4 bits of y to the 4 bits of x, or it could extend y with an extra zero as its most significant bit to preserve its value and then assign these 5 bits to x, assuming x was declared to be 5 bits wide. To see the difference, consider the case when y is 10. By simply assigning these 4 bits to a signed integer, a result of -6 would be placed in x. A better solution might be to extend y to a five bit value by adding a 0 bit as its MSB to preserve the value of 10. The solution adopted by Handel-C is not to allow automatic conversions between signed and unsigned values to avoid this confusion. Instead, values may be ‘cast’ between types to ensure that the programmer is aware that a conversion is occurring that may alter the meaning of a value. The above example then becomes:

[2499] int 4 x;

[2500] unsigned int 4 y;

[2501] x = (int 4)y;

[2502] It is now clear that the value of x is the result of treating the 4 bits extracted from y as a signed integer. One can also cast to a type of undefined width. For example:

[2503] int 4 x;

[2504] unsigned int undefined y;

[2505] x = (int undefined)y;

[2506] Here, the compiler may infer that y may be 4 bits wide. Casting cannot be used to change the width of values. For example,

[2507] this is not allowed:

[2508] unsigned int 7 x;

[2509] int 12 y;

[2510] y = (int 12)x; // Not allowed

[2511] Instead, the conversion should be done explicitly:

[2512] y = (int 12)(0 @ x);

[2513] Here, the concatenation operation produces a 12-bit unsigned value. The casting then changes this to a 12-bit signed integer for assignment to y. Again, this is to ensure that the programmer is aware of such conversions. To illustrate why this is important, consider the following example:

[2514] int 7 x;

[2515] unsigned int 12 y;

[2516] x = -5;

[2517] y = (unsigned int 12)x;

[2518] Here, the Handel-C compiler could take two equally viable routes. One would be to sign extend the value of x and produce the result 4091. The second would be to zero pad the value of x and produce the value of 123. Since neither method can preserve the value of x in y, Handel-C performs neither automatically. Rather, it is left up to the programmer to decide which approach is correct in a particular situation and to write the expression accordingly.
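The two routes, and the earlier 4-bit case of y = 10, can be worked through in Python. The helper names sign_extend, zero_pad and to_signed are hypothetical; the sketch assumes two's-complement bit patterns.

```python
def to_signed(bits, width):
    """Interpret a 'width'-bit pattern as a two's-complement number."""
    bits &= (1 << width) - 1
    return bits - (1 << width) if bits >> (width - 1) else bits


def sign_extend(bits, from_width, to_width):
    """Widen a pattern by repeating its top bit."""
    bits &= (1 << from_width) - 1
    if bits >> (from_width - 1):
        bits |= ((1 << to_width) - 1) & ~((1 << from_width) - 1)
    return bits


def zero_pad(bits, from_width):
    """Widen a pattern by adding zero bits above it."""
    return bits & ((1 << from_width) - 1)


x_bits = -5 & 0x7F                        # int 7 x = -5, as a 7-bit pattern
route_one = sign_extend(x_bits, 7, 12)    # read as unsigned 12
route_two = zero_pad(x_bits, 7)           # read as unsigned 12
reinterpreted = to_signed(10, 4)          # unsigned 4 value 10 read as int 4
print(route_one, route_two, reinterpreted)
```

Neither 12-bit result equals -5, which is why the choice is left to the programmer.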

[2519] Functions

[2520] Introduction

[2521] Functions are similar to functions in ISO-C. Handel-C has been extended to provide arrays of functions and inline functions. Arrays of functions provide multiple copies of a function. One can select which copy is used at any time. Inline functions are similar to macros in that they are expanded wherever they are used. Functions take arguments and return values. A function that does not return a value is of type void. The default return type is int undefined.

[2522] When a function is declared or defined, it has a parameter list, which describes the type of arguments that it expects to receive. Functions that do not take arguments have void as their parameter list, e.g. void main(void).

[2523] As in ISO-C, function arguments are passed by value. This means that a local copy is created that is only in scope within the function. Changes take place on this copy. To access a variable outside the function, one may pass the function a pointer to that variable. A local copy may be made of the pointer, but it may still point to the same variable. This is known as passing by reference. Architectural types (hardware constructs) may be passed by reference (a pointer to or address of the construct). The only architectural type that can be passed to or returned by a function by value is a signal. All others (and structures or unions containing them) may be passed by reference. Arrays and functions can also only be passed by reference.

[2524] Function Definitions and Declarations

[2525] Functions are defined as in ISO-C. The function declaration consists of the function name, and names and types for its parameters and return value. The definition of a function consists of its declaration plus the code body that it performs when it is called.

    returnType Name(parameterList)
    {
        declarations
        statements
    }

[2526] If the declaration is followed by a semi-colon, it is a function prototype. This tells the compiler the types of arguments that the function expects so it can check that the function is used correctly within the rest of the file.

[2527] returnType Name(parameterList);

[2528] The names in a function prototype are only in scope in the prototype. One can use different names in the definition of the function and function calls. Functions may be declared (prototyped) in every file that they are used in, though they should only be defined once. It is common to put function prototypes into a header file and #include that in every file where they are used.

[2529] Scope

[2530] Functions cannot be defined within other functions. By default, functions are extern (they can be used anywhere). Functions can also be defined as static (they can only be used in the file in which they are defined).

[2531] Arrays of Functions

[2532] An array of functions is a collection of identical functions. It is not the same as an array of function pointers (each of whose elements can point to a different function). Function arrays allow functions to be copied and shared neatly. Here is a declaration of a simple function array:

    unsigned func[2](unsigned x, unsigned y)
    {
        return(x + y);
    }

[2533] The syntax is a normal function declaration, with square brackets added to specify that this is an array declaration as well as a function declaration. The general form of a function array declaration is:

[2534] returnType Name[Size](parameterList)

[2535] One can also declare a function array in a prototype. This means that one can declare a function func in one file, and an array of functions of type func in another file.

[2536] void func[n](void);

[2537] A function array allows one to run different copies of the function in parallel. Without this construct, the only safe way to run a function in parallel with itself would be to explicitly declare two functions with different names. This would not be so neat and intuitive.

[2538] Example

    set clock = external "P1";

    // Function array prototype
    unsigned func[2](unsigned x, unsigned y);

    // Main program
    void main(void)
    {
        unsigned a, b, c, d, e, f;

[2539] Function Pointers

[2540] These are a very powerful, yet potentially confusing, feature. In situations where any one of a number of functions can be called at a particular point, it is neater and more concise to use a function pointer, where the alternative might be a long if-else chain or a long switch statement. For example, consider this program:

[2541]

    unsigned 1 check(short int *a, short int *b,
                     unsigned 1 (*chk)(short int *, short int *));
    unsigned 1 addeven(const short int *x, const short int *y);
    unsigned 1 multeven(const short int *x, const short int *y);
    unsigned 1 diveven(const short int *x, const short int *y);
    unsigned 1 modeven(const short int *x, const short int *y);

    void main(void)
    {
        short int m, n;
        unsigned 2 choice;
        unsigned 1 result;
        unsigned 1 (*p)(const short *, const short *);

        par { m = 19; n = 47; }
        do
        {
            switch (choice)
            {
                case 0: p = addeven; break;
                case 1: p = multeven; break;
                case 2: p = diveven; break;
                case 3: p = modeven; break;
                default: break;
            }
            par
            {
                result = check(&m, &n, p);
                choice++;
            }
        } while (choice);
        delay;
    }

    unsigned 1 check(short int *a, short int *b,
                     unsigned 1 (*chk)(short int *, short int *))
    {
        return (*chk)(a, b);
    }

    unsigned 1 addeven(const short int *x, const short int *y)
    {
        return (unsigned)(*x + *y)[0];
    }

    unsigned 1 multeven(const short int *x, const short int *y)
    {
        return (unsigned)(*x * *y)[0];
    }

    unsigned 1 diveven(const short int *x, const short int *y)
    {
        return (unsigned)(*x / *y)[0];
    }

    unsigned 1 modeven(const short int *x, const short int *y)
    {
        return (unsigned)(*x % *y)[0];
    }

[2542] The function addeven checks whether the sum of two numbers is even. Similar checks are carried out by multeven (product of two numbers), diveven (division) and modeven (modulus). The function check simply calls the function whose pointer it receives, with the arguments it receives. This gives a consistent interface to the xxxeven functions. Pay close attention to the declaration of check, and of function pointer p. The parentheses around *p (and *chk in the declaration of check) are necessary for the compiler to make the correct interpretation.
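The dispatch pattern described above can be sketched in Python, where functions are already first-class values. This is a model only and not Handel-C. The table name CHECKS is hypothetical. As in the program under discussion, each xxxeven function returns bit 0 of its result, so a returned 0 indicates an even result.

```python
def addeven(x, y):
    return (x + y) & 1          # bit 0 of the sum


def multeven(x, y):
    return (x * y) & 1          # bit 0 of the product


def diveven(x, y):
    # Operands here are non-negative, so floor and truncating division agree.
    return (x // y) & 1         # bit 0 of the quotient


def modeven(x, y):
    return (x % y) & 1          # bit 0 of the remainder


def check(a, b, chk):
    """Call whichever function is passed in, with the arguments received."""
    return chk(a, b)


CHECKS = [addeven, multeven, diveven, modeven]   # indexed by 'choice'
results = [check(19, 47, CHECKS[choice]) for choice in range(4)]
print(results)
```

Indexing the table by choice plays the role of the switch statement that assigns the pointer p.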

[2543] Indirection Techniques

[2544] Function pointers can be assigned with or without the address operator & (similar to assigning array addresses). Functions pointed to can be called with or without the indirection operator. In the code above, the function name was assigned to the pointer without the &:

[2545] p = addeven;

[2546] One may wish to use the & format for clarity:

[2547] p = &addeven;

[2548] Inside check, the function pointed to by chk was called by writing:

[2549] (*chk)(a, b);

[2550] This could also have been written in the shorthand form:

[2551] chk(a, b);

[2552] The first form is preferable, as it tips off anyone reading the code that a function pointer is being used.

[2553] Inside the main program body, check was called like this:

[2554] check(&m, &n, p);

[2555] It could have been written like this,

[2556] check(&m, &n, xxxeven);

[2557] eliminating the need for an additional pointer variable. Here is the main section written using this form of expression:

    void main(void)
    {
        short int m, n;
        unsigned 2 choice;
        unsigned 1 result;

        par { m = 19; n = 47; }
        do
        {
            switch (choice)
            {
                case 0: result = check(&m, &n, addeven); break;
                case 1: result = check(&m, &n, multeven); break;
                case 2: result = check(&m, &n, diveven); break;
                case 3: result = check(&m, &n, modeven); break;
                default: break;
            }
            choice++;
        } while (choice);
        delay;
    }

[2558] Restrictions on Functions

[2559] Shared Code

[2560] Functions may not be shared by two different parts of the program on the same clock cycle. For example:

    int func(x, y);

    par
    {
        a = func(b, c);
        {
            b = foo;
            d = func(e, f); // NOT ALLOWED
        }
    }

    int func(int x, int y)
    {
        if (x == y)
            delay;
        else
        {
            x = x % y;
        }
        x *= 10;
        return(x);
    }

[2561] This is not allowed because part of the single function is used twice in the same clock cycle. This overlapping usage is not detected by the compiler, as it is a run-time error. It is therefore the programmer's responsibility to ensure that code usage does not overlap. This may be done by declaring functions to be inline (so that they are expanded wherever they are used) or by declaring an array of functions, one to be used in each parallel branch.

    inline int func(x, y);

    par
    {
        a = func(b, c);
        {
            b = foo;
            d = func(e, f);
        }
    }

or

    int func[3](x, y);

    par
    {
        a = func[0](b, c);
        {
            b = foo;
            d = func[1](e, f);
        }
    }

[2562] More details on the timing of Handel-C programs, and on how one can tell in which clock cycle operations are performed, will be set forth later.

[2563] Recursion

[2564] Due to the absence of a stack in Handel-C, functions cannot be recursive. If a function is called within its own body, the compiler generates an error.

[2565] Macros

[2566] Introduction

[2567] As mentioned in previous sections, the Handel-C compiler passes source code through a standard C preprocessor before compilation, allowing the use of #define to define constants and macros in the usual manner. There are some limitations to this approach. Since the preprocessor can only perform textual substitution, some useful macro constructs cannot be expressed. For example, there is no way to create recursive macros using the preprocessor.

[2568] Handel-C provides additional macro support to allow more powerful macros to be defined (for example, recursive macro expressions). In addition, Handel-C supports shared macros to generate one piece of hardware which is shared by a number of parts of the overall program, similar to the way that procedures allow conventional C to share one piece of code between many parts of a conventional program. This section of the present description details how to define macros and shared hardware.

[2569] Macro Expressions

[2570] Macros may be used to replace expressions to avoid tedious repetition. Handel-C provides some powerful macro constructs to allow complex expressions to be generated simply.

[2571] Constant Macro Expressions

[2572] Constant macro expressions are of two types:

[2573] a simple constant, equivalent to #define

[2574] a constant expression

[2575] Constant

[2576] This first form of the macro is a simple expression. For example:

[2577] macro expr DATA_WIDTH = 15;

[2578] int DATA_WIDTH x;

[2579] This form of the macro is similar to the #define macro. Whenever DATA_WIDTH appears in the program, the constant 15 is inserted in its place.

[2580] Constant Expression

[2581] To provide a more general solution, one can use a real expression. For example:

[2582] macro expr sum = (x + y) @ (y + z);

[2583] v = sum;

[2584] w = sum;

[2585] Parameterized Macro Expressions

[2586] FIG. 57A-2 illustrates a method 5750 for parameterized expressions, in accordance with various embodiments of the present invention. In general, a plurality of first variables are defined with reference to variable widths. See operation 5752. A plurality of second variables are also defined without reference to variable widths, as indicated in operation 5754. In an aspect of the present invention, the first and second variables may be included in a library.

[2587] Computer code is then compiled including the first and second variables. Note operation 5756. As such, the variable widths of the second variables may be inferred from the variable widths of the first variables. See operation 5758. In one embodiment of the present invention, the variable widths of the second variables may be inferred during a routine that reconciles the first variables with the second variables in the library. As an option, a relation may be defined between the first variables and the second variables.

[2588] In yet another aspect, the first variables may be further defined with reference to data types, the second variables may be defined without reference to the data types, and the data types of the second variables may be inferred from the data types of the first variables. In even another aspect of the present invention, the first variables may be further defined with reference to array size, the second variables may be defined without reference to the array size, and the array size of the second variables may be inferred from the array size of the first variables. In yet another aspect, the first variables may be further defined with reference to pipeline depth, the second variables may be defined without reference to the pipeline depth, and the pipeline depth of the second variables may be inferred from the pipeline depth of the first variables.

[2589] It should be noted that the above concept may be applied in more general contexts per the desires of the user. For example, an application may be defined with a first variable where the first variable's width is unresolved. Thereafter, the application may be stored in a library, and computer code may be compiled including the first variable. In one embodiment of the present invention, a plurality of libraries may be used to organize functional components of predefined functions.

[2590] As such, the variable width of the first variable may be resolved as the application is utilized in any desired manner. For example, the variable width of the first variable may be resolved utilizing predefined rules during compilation. Still yet, a plurality of variables may be resolved dynamically during compilation. As yet another option, the variable width of the first variable may change in response to the compilation in a first application or a second application.

[2591] As an option, the first variable may be defined with no reference to a data type. Accordingly, the data type of the first variable is resolved dynamically as the compilation proceeds. In a similar manner, the first variable may be defined without reference to array size. Further, the array size of the second variables may be resolved dynamically during compilation as the first variable is used by an application.

[2592] More information regarding the above concept will now be set forth in greater detail. Handel-C also allows macros with parameters. For example:

[2593] macro expr add3(x) = x+3;

[2594] y = add3(z);

[2595] This is equivalent to the following code:

[2596] y = z + 3;

[2597] Again, this form of the macro is similar to the #define macro in that every time the add3( ) macro is referenced, it is expanded in the manner shown above. In other words, in this example, an adder is generated in hardware every time the add3( ) macro is used.

[2598] The Select Operator

[2599] Handel-C provides a select(. . . ) operator which is used to mean ‘select at compile time’. Its general usage is:

select(Expression, Expression, Expression)

Here, the first expression may be a compile time constant. If the first expression evaluates to true then the Handel-C compiler replaces the whole expression with the second expression. If the first expression evaluates to false then the Handel-C compiler replaces the whole expression with the third expression. The difference between this and the ? : operators is best illustrated with an example.

[2600] w = (width(x)==4 ? y : z);

[2601] This example generates hardware to compare the width of the variable x with 4 and set w to the value of y or z depending on whether this value is equal to 4 or not. This is probably not what was intended in this case because both width(x) and 4 are constants. What was probably intended was for the compiler to check whether the width of x was 4 and then simply replace the whole expression above with y or z according to the value. This can be written as follows:

[2602] w = select(width(x)==4, y, z);

[2603] In this example, the compiler evaluates the first expression and replaces the whole line with either w = y; or w = z;. No hardware for the conditional is generated.

[2604] A more useful example can be seen when macros are combined with this feature. For example:

[2605] macro expr adjust(x, n) =

[2606] select(width(x) < n, (0 @ x), (x <- n));

[2607] unsigned 4 a;

[2608] unsigned 5 b;

[2609] unsigned 6 c;

[2610] b = adjust(a, width(b));

[2611] b = adjust(c, width(b));

[2612] This example is for a macro that equalizes widths of variables in an assignment. If the right hand side of an assignment is narrower than the left hand side then the right hand side may be padded with zeros in its most significant bits. If the right hand side is wider than the left hand side, the least significant bits of the right hand side may be taken and assigned to the left hand side.

[2613] The select(. . . ) operator is used here to tell the compiler to generate different expressions depending on the width of one of the parameters to the macro. The last two lines of the example could have been written by hand as follows:

[2614] b = 0 @ a;

[2615] b = c <- 5;

[2616]

[2617] However, the macro comes into its own if the width of one of the variables changes. For example, suppose that during debugging, it is discovered that the variable a is not wide enough and needs to be 8 bits wide to hold some values used during the calculation. By using the macro, the only change required would be to alter the declaration of the variable a. The compiler would then replace the statement b = 0 @ a; with b = a <- 5; automatically.

[2618] This form of macro is also useful when variables of undefined width are used. If the compiler is used to infer widths of variables, it may be tedious to work out by hand which form of the assignment is required. By using the select(. . . ) operator in this way, the correct expression is generated without one having to know the widths of variables at any stage.
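The width-dependent choice made by the adjust macro can be modelled in Python. The sketch assumes unsigned values and passes the width explicitly, since a Python integer carries no width; the function name adjust mirrors the macro, while the parameter value_width and the sample values are hypothetical.

```python
def adjust(value, value_width, n):
    """Model of select(width(x) < n, (0 @ x), (x <- n))."""
    if value_width < n:
        # 0 @ x: zero padding leaves the numeric value unchanged.
        return value
    # x <- n: keep only the n least significant bits.
    return value & ((1 << n) - 1)


a = 0b1011          # unsigned 4
c = 0b101101        # unsigned 6
b_from_a = adjust(a, 4, 5)     # corresponds to b = 0 @ a;
b_from_c = adjust(c, 6, 5)     # corresponds to b = c <- 5;
b_from_wide_a = adjust(0b10101011, 8, 5)   # a redeclared as 8 bits wide
print(b_from_a, b_from_c, b_from_wide_a)
```

In Handel-C the branch is chosen at compile time, so only one of the two expressions becomes hardware; the Python if statement merely imitates that choice at run time.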

[2619] Ifselect

[2620] Syntax

[2621] ifselect (Condition)

[2622] Statement 1

[2623] [else

[2624] Statement 2]

[2625] ifselect checks the result of a compile-time constant expression at compile time. If the condition is true, the following statement or code block is compiled. If false, it is dropped and an else condition can be compiled if it exists. Thus, whole statements can be selected or discarded at compile time, depending on the evaluation of the expression.

[2626] The ifselect construct allows one to build recursive macros, in a similar way to select. It is also useful inside replicated blocks of code as the replicator index is a compile-time constant. Hence, one can use ifselect to detect the first and last items in a replicated block of code and build pipelines.

[2627] Example

[2628] int 12 a;

[2629] int 13 b;

[2630] int undefined c;

[2631] ifselect(width(a) >= width(b))

[2632] c = a;

[2633] else

[2634] c = b;

[2635] c is assigned the value of either a or b, depending on their width relationship.

[2636] Pipeline Example

[2637] unsigned init;

[2638] unsigned q[15];

[2639] unsigned 31 out;

[2640] init = 57;

[2641] par (r = 0; r < 16; r++)

[2642] {

[2643] ifselect(r == 0)

[2644] q[r] = init;

[2645] else ifselect(r == 15)

[2646] out = q[r-1];

[2647] else

[2648] q[r] = q[r-1];

[2649] }
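The effect of ifselect inside the replicated par block can be pictured by unrolling it by hand. The following is a minimal Python sketch, not Handel-C; the function name unroll_pipeline is hypothetical, and the code only prints the statement that each copy of the block would keep.

```python
def unroll_pipeline(stages):
    """Mimic a replicated par block: r is a compile-time constant, so each
    copy keeps only the branch that ifselect picks for its value of r."""
    statements = []
    for r in range(stages):
        if r == 0:                        # ifselect(r == 0)
            statements.append("q[0] = init;")
        elif r == stages - 1:             # else ifselect(r == 15) when stages is 16
            statements.append(f"out = q[{r - 1}];")
        else:                             # every stage in between
            statements.append(f"q[{r}] = q[{r - 1}];")
    return statements


for line in unroll_pipeline(16):
    print(line)
```

Sixteen statements result: one that loads the first register, fourteen that shift, and one that drives out from the last register.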

[2650] Recursive Macro Expressions

[2651] A serious limitation with preprocessor macros (those defined with #define) is their inability to generate recursive expressions. By combining Handel-C macros (those defined with macro expr) and the select(. . . ) operator discussed above, recursive macros can be used to simply express complex hardware. This type of macro is particularly important in Handel-C where the exact form of the macro may depend on the width of a parameter to the macro. As an example, a sign extension of a variable is taken. When assigning a narrow signed variable to a wider variable, the most significant bits of the wide variable should be padded with the sign bit (MSB) of the narrow variable. For example, the 4-bit representation of −2 is 0b1110. When assigned to an 8-bit wide variable, this should become 0b11111110. In contrast, the 4-bit representation of 6 is 0b0110. When assigned to an 8-bit wide variable, this should become 0b00000110.

[2652] In this example, the following code would suffice:

[2653] int 8 x;

[2654] int 4 y;

[2655] x = y[3] @ y[3] @ y[3] @ y[3] @ y;

[2656] As one can see, this can rapidly become tedious for variables that differ by a significant number of bits. Also, what if the exact widths of the variables are not known? What is needed is a macro to sign extend a variable. For example:

[2657] macro expr copy(x, n) =

[2658] select(n==1, x, (x @ copy(x, n-1)));

[2659] macro expr extend(y, m) =

[2660] copy(y[width(y)-1], m-width(y)) @ y;

[2661] int a;

[2662] int b; // Where b is known to be wider than a

[2663] b = extend(a, width(b));

[2664] Here, the copy macro generates n copies of the expression x concatenated together. The macro is recursive and uses the select(. . . ) operator to evaluate whether it is on its last iteration (in which case it just evaluates to the expression) or whether it should continue to recurse by a further level. The extend macro simply concatenates the sign bit of its parameter m-k times onto the most significant bits of the parameter. Here, m is the required width of the expression y and k is the actual width of the expression y. The final assignment correctly sign extends a to the width of b for any variable widths where width(b) is greater than width(a).
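The recursion in copy and extend can be traced with a small Python model. This is an illustrative sketch only: bit vectors are represented as strings written MSB first (an assumption of the sketch), so the sign bit y[width(y)-1] becomes the first character of the string.

```python
def copy(x, n):
    """n copies of bit string x joined together (mirrors the copy macro).

    As with the macro, n must be at least 1.
    """
    return x if n == 1 else x + copy(x, n - 1)


def extend(y, m):
    """Sign-extend bit string y (MSB first) to m bits; requires m > len(y)."""
    sign_bit = y[0]
    return copy(sign_bit, m - len(y)) + y


print(extend("1110", 8))   # 4-bit -2 widened to 8 bits
print(extend("0110", 8))   # 4-bit 6 widened to 8 bits
```

The two calls reproduce the 0b11111110 and 0b00000110 values quoted in the text.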

[2665] Recursive Macro Expressions: A Larger Example

[2666] A second example of the use of recursive macro expressions is now given to illustrate the generation of large quantities of hardware from simple macros. The example used is that of a multiplier whose width depends on the parameters of the macro. Although Handel-C includes a multiplication operator as part of the language, this example serves as a starting point for generating large regular hardware structures using macros.

[2667] The multiplier generates the hardware for a single cycle long multiplication operation from a single macro. The source code is:

[2668] macro expr multiply(x, y) =

[2669] select(width(x) ==0, 0,

[2670] multiply(x \\ 1, y << 1) +

[2671] (x[0]==1 ? y: 0));

[2672] a = multiply (b , c);

[2673] At each stage of recursion, the multiplier tests whether the bottom bit of the x parameter is 1. If it is then y is added to the ‘running total’. The multiplier then recurses by dropping the LSB of x and multiplying y by 2 until there are no bits left in x. The overall result is an expression that is the sum of each bit in x multiplied by y. This is the familiar long multiplication structure. For example, if both parameters are 4 bits wide, the macro expands to:

[2674] a = ((b \\ 3)[0]==1 ? c<<3 : 0) +

[2675] ((b \\ 2)[0]==1 ? c<<2 : 0) +

[2676] ((b \\ 1)[0]==1 ? c<<1 : 0) +

[2677] (b[0]==1 ? c : 0);

[2678] This code is equivalent to:

[2679] a = ((b & 8)==8 ? c*8 : 0) +

[2680] ((b & 4)==4 ? c*4 : 0) +

[2681] ((b & 2)==2 ? c*2 : 0) +

[2682] ((b & 1)==1 ? c : 0);

[2683] which is a standard long multiplication calculation.
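The same shift-and-add recursion can be checked numerically. The sketch below is a Python model rather than Handel-C; it assumes both operands and the result share one fixed width, so the product is truncated to that width in the way a fixed-width expression would be.

```python
def multiply(x, y, width):
    """Recursive shift-and-add multiply, truncated to `width` bits."""
    mask = (1 << width) - 1

    def go(x, y, bits_left):
        if bits_left == 0:                    # select(width(x) == 0, 0, ...)
            return 0
        partial = y if (x & 1) else 0         # x[0] == 1 ? y : 0
        rest = go(x >> 1, (y << 1) & mask, bits_left - 1)   # drop LSB of x, double y
        return (rest + partial) & mask

    return go(x & mask, y & mask, width)


print(multiply(5, 3, 4))   # 15
print(multiply(7, 6, 4))   # 42 truncated to 4 bits gives 10
```

For 4-bit operands the recursion bottoms out after four levels, one per term of the expanded expression shown above.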

[2684] Shared Expressions

[2685] By default, Handel-C generates all the hardware required for every expression in the whole program. In many programs, this means that large parts of the hardware may be idle for long periods. The shared expression allows hardware to be shared between different parts of the program to decrease hardware usage. The shared expression has the same format as a macro expression but does not allow recursion. An example program where shared expressions are extremely useful is:

[2686] a = b * c;

[2687] d = e * f;

[2688] g = h * i;

[2689] Here, three multipliers may be generated but each one may only be used once and none of them simultaneously. This is a massive waste of hardware. The way to improve this program is:

[2690] shared expr mult(x, y) = x * y;

[2691] a = mult(b, c);

[2692] d = mult(e, f);

[2693] g = mult(h, i);

[2694] In this example, only one multiplier is built and it is used on every clock cycle which is a better use of hardware. (In fact, the above example could be built as three multipliers executing in parallel if the maximum performance is required).

[2695] It is not always the case that less hardware is generated by using shared expressions because multiplexers may need to be built to route the data paths. Some expressions use less hardware than the multiplexers associated with the shared expression.

[2696] Using Recursion to Generate Shared Expressions

[2697] Although shared expressions cannot use recursion directly, macro expressions can be used to generate hardware which can then be shared using a shared expression. For example, to share the recursive multiplier macro example above one could write:

[2698] macro expr multiply(x, y) =

[2699] select(width(x) == 0, 0,

[2700] multiply(x \\ 1, y << 1) +

[2701] (x[0] == 1 ? y : 0));

[2702] shared expr mult(x, y) = multiply(x, y);

[2703] a = mult(b, c);

[2704] d = mult(e, f);

[2705] Here, the macro expression builds a multiplier and the shared expression allows that hardware to be shared between the two assignments.

[2706] Restrictions on Shared Expressions

[2707] A limitation to shared expressions is that they may not be shared by two different parts of the program on the same clock cycle. For example:

[2708] shared expr mult(x, y) = x * y;

[2709] par

{

[2710] a = mult(b, c);

[2711] d = mult(e, f); // NOT ALLOWED

[2712] }

[2713] This is not allowed because the single multiplier is used twice in the same clock cycle. Avoiding such conflicts becomes an important skill when using shared expressions.

[2714] let . . . in

[2715] The Handel-C constructs let and in allow one to declare macro expressions within macro expressions. In this way, complex macros may be broken down into simple ones, whilst still being grouped together in a single block of code. They also provide easy sharing of recursive macros. The let keyword starts the declaration of a local macro; the in keyword ends the declaration and defines its scope.

[2716] Example

[2717] macro expr Fred(x) =

[2718] let macro expr y = x*2; in

[2719] y+3; // Returns x*2+3

[2720] The top line defines the macro name and parameters. The second line defines y within the macro definition. The last line expresses the value of the macro in full.

[2721] Independent Let . . . In Definitions

[2722] macro expr op(a, b) =

[2723] let macro expr t2(x) = x * 2; in

[2724] let macro expr d3(x) = x / 3; in

[2725] let macro expr t4(x) = x * 4; in

[2726] t2(a) + d3(b) + t4(a - b) + t2(b - a);

[2727] is equivalent to writing

[2728] macro expr op(a, b) = (a * 2) + (b / 3) + ((a-b) * 4) +

[2729] ((b-a) * 2);

[2730] Related Let . . . In Definitions

[2731] macro expr op(a, b) =

[2732] let macro expr sum(x, y) = x + y; in

[2733] let macro expr mult(x, y) = x * sum(x, y); in

[2734] mult(a, b) - (b * b);

[2735] sum is defined within the macro definition, then mult is defined using sum. This example is equivalent to:

[2737] macro expr op(a, b) = (a * (a + b)) - (b * b);

[2738] Shared Recursive Macro

[2739] The following recursive multiplier illustrates the way in which let . . . in can be used to share recursive macros.

[2740] shared expr mult(p, q) =

[2741] let macro expr multiply(x, y) =

[2742] select(width(x) == 0, 0, multiply(x \\ 1, y << 1)

[2743] + (x[0] == 1 ? y : 0)); in

[2744] multiply(p, q);

[2745] Macro Procedures

[2746] Macros may be used to replace statements to avoid tedious repetition. Handel-C provides simple macro constructs to expand single statements into complex blocks of code. The general syntax of macro procedures is:

[2747] macro proc Name(Params) Statement

[2748] For example:

macro proc output(x, y)
{
    out ! x;
    out ! y;
}
output(a + b, c * d);
output(a + b, c * d);

[2749] This example writes the two expressions a+b and c*d twice to the channel out. This example also illustrates that the statement may be a code block, in this case two instructions executed sequentially. Macro procedures generate the hardware for their statement every time they are referenced. The above example expands to 4 channel output statements. Macro procedures differ from preprocessor macros in that they are not simple text replacements. The statement section of the definition should be a complete, valid Handel-C statement. For example:

[2750] #define test(x,y) if (x!=(y<<2))

[2751] test(a,b)

[2752] {

[2753] a++;

[2754] }

[2755] else

[2756] {

[2757] b++;

[2758] }

[2759] This is a valid preprocessor macro definition. However, the following code is not allowed:

[2760] macro proc test(x,y) if (x!=(y<<2));

[2761] test(a,b) // NOT ALLOWED

[2762] {

[2763] a++;

[2764] }

[2765] else

[2766] {

[2767] b++;

[2768] }

[2769] Here, the macro procedure is not defined to be a complete statement so the Handel-C compiler generates an error. This restriction provides protection against writing code such as these examples which is generally unreadable and difficult to maintain.

[2770] Macro Prototypes

[2771] As with functions, macros may be prototyped. This allows one to declare them in one file and use them in another. A macro prototype consists of the name of the macro plus a list of the names of its parameters. For example:

[2772] macro proc work(x, y);

[2773] shared expr mult(p, q);

10 Timing and efficiency information

[2774] Timing Information

[2775] Introduction

[2776] A Handel-C program executes with one clock source for each main statement. It is important to be aware exactly which parts of the code execute on which clock cycles. This is not only important for writing code that executes in fewer clock cycles but may mean the difference between correct and incorrect code when using Handel-C's parallelism. Knowing about clock cycles also becomes important when considering interfaces to external hardware. This subject is covered in greater detail later but it is important to understand timing issues before moving on to implementing such interfaces because it is likely that the external device may place constraints on when signals should change.

[2777] This section of the present description also deals with the subject of overall performance. It shall be seen that avoiding certain constructs has a dramatic influence on the maximum clock rate that the Handel-C program can run at and some guidelines are given for improving the hardware performance. An example is given that covers the considerations for real time constraints on a system.

[2778] Clock Cycle Timing of Language Constructs

[2779] This section deals with the analysis of a program in terms of the number of clock cycles it takes to execute. The Handel-C language has been designed so that an experienced programmer can immediately tell which instructions execute on which clock cycles. This information becomes very important when the program contains multiple interacting parallel processes.

[2780] Statement Timing

[2781] The basic rule for working out the number of cycles used in a Handel-C program is:

[2782] Assignment and delay take 1 clock cycle.

[2783] Everything else is free.

[2784] What this means is that every time one writes an assignment statement or a delay statement, one uses one clock cycle, but one can write any other piece of code and not use any clock cycles to execute it. The only exception is channel communication, which takes one clock cycle only if both parties are ready to communicate in the same clock domain. This means that if one parallel branch is ready to output on a channel when another branch is ready to receive then it takes one clock cycle for the data to be assigned to the variable in the input statement. If one of the branches is not ready for the data transfer then execution of the other branch waits until both branches become ready. Even if both branches are ready for the transfer then one clock cycle is used because channel input is a form of assignment. Some simple examples with their timings are shown below.

[2785] Statements

[2786] x = y;

[2787] x = (((y * z) + (w * v))<<2)<-7;

[2788] Each of these statements takes one clock cycle. Notice that even the most complex expression can be evaluated in a single clock cycle. Handel-C simply builds the combinatorial hardware to evaluate such expressions; they do not need to be broken down into simpler assembly instructions as would be the case for conventional C.

[2789] Parallel Statements

[2790] par

[2791] {

[2792] x = y;

[2793] a = b * c;

}

[2794] This code executes in a single cycle because each branch of the parallel statement takes only one clock cycle. This example illustrates the benefits of parallelism. One can have as many non-interdependent instructions as desired in the branches of a parallel statement. The total time for execution is the length of time that the longest branch takes to execute. For example:

[2795] par

[2796] {

[2797] x = y;

[2798] {

[2799] a = b;

[2800] c = d;

[2801] }

[2802] }

[2803] This code takes two clock cycles to execute. On the first cycle, x = y and a = b take place. On the second clock cycle, c = d takes place. Since both branches of the par statement have to complete before the par block can complete, the first branch delays for one clock cycle while the second instruction in the second branch is executed.

[2804] While loop

[2805] x = 5;

[2806] while (x>0)

[2807] {

[2808] x--;

[2809] }

[2810] This code takes a total of 6 clock cycles to execute. One cycle is taken by the assignment of 5 to x. Each iteration of the while loop takes 1 clock cycle for the assignment of x - 1 to x and the loop body is executed 5 times. The condition of the while loop takes no clock cycles as no assignment is involved.

[2811] For loop

[2812] for (x = 0; x < 5; x++)

[2813] {

[2814] a += b;

b *= 2;

}

[2815] As discussed earlier, this code has an almost direct equivalent:

[2816] {

[2817] x = 0;

[2818] while (x<5)

[2819] {

[2820] a += b;

[2821] b *= 2;

[2822] x ++;

[2823] }

}

[2824] This code takes 16 clock cycles to execute. One is required for the initialisation of x and three for each execution of the body. Since the body is executed 5 times, this gives a total of 16 clock cycles.
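Both loop counts follow directly from the rule that only assignments and delays cost a clock cycle. The following Python sketch simply tallies the assignments; the function names are hypothetical and the code is a counting aid, not a model of the generated hardware.

```python
def while_cycles(start):
    """Cycles for: x = start; while (x > 0) { x--; }"""
    cycles = 1                # the assignment x = start
    x = start
    while x > 0:              # the test itself costs nothing
        x -= 1
        cycles += 1           # one cycle per x--
    return cycles


def for_cycles(iterations, body_assignments):
    """Cycles for a for loop rewritten as initialisation plus while loop."""
    cycles = 1                        # the initialisation x = 0
    x = 0
    while x < iterations:
        cycles += body_assignments    # a += b; b *= 2;
        x += 1
        cycles += 1                   # the iteration step x++
    return cycles


print(while_cycles(5))     # 6
print(for_cycles(5, 2))    # 16
```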

[2825] Decision

[2826] if (a>b)

[2827] {

[2828] x = a;

[2829] }

[2830] else

[2831] {

[2832] x = b;

[2833] }

[2834] This code takes exactly one clock cycle to execute. Only one of the branches of the if statement is executed, either x = a or x = b. Each of these assignments takes one clock cycle. Notice again that no time is taken for the test because no assignment is made. A slightly different example is:

[2835] if (a>b)

[2836] {

[2837] x = a;

[2838] }

[2839] Here, if a is not greater than b, there is no else branch. This code therefore takes either 1 clock cycle if a is greater than b or no clock cycles if a is not greater than b.

[2840] Channels

[2841] Channel communications are more complex. The simplest example is:

[2842] par

[2843] {

[2844] link ! x; // Transmit

[2845] link ? y; // Receive

[2846] }

[2847] This code takes a single clock cycle to execute because both the transmitting and receiving branches are ready to transfer at the same time. All that is required is the assignment of x to y which, like all assignments, takes 1 clock cycle. A more complex example is:

[2848] par

[2849] {

[2850] { // Parallel branch 1

[2851] a = b;

[2852] c = d;

[2853] link ! x;

[2854] }

[2855] link ? y; // Parallel branch 2

}

[2856] Here, the first branch of the par statement takes three clock cycles to execute. However, the second branch of the par statement also takes three clock cycles to execute because it has to wait for two cycles before the transmitting branch is ready. The usage of clock cycles is as follows:

Cycle   Branch 1         Branch 2
1       a = b;           delay
2       c = d;           delay
3       Channel output   Channel input
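The cycle-by-cycle behaviour of the two branches can be reproduced with a small Python model. This is a sketch under simplifying assumptions: each branch is a flat list of operations, a single channel connects the two branches, and the name run_par is hypothetical.

```python
def run_par(branch1, branch2):
    """Step two parallel branches one clock cycle at a time.

    Each branch is a list of "assign", "send" or "recv" operations.
    Returns (cycles, trace); trace holds one [op1, op2] row per cycle.
    """
    branches = [branch1, branch2]
    pc = [0, 0]
    trace = []
    while pc[0] < len(branch1) or pc[1] < len(branch2):
        ops = [b[i] if i < len(b) else None for b, i in zip(branches, pc)]
        # A transfer happens only when one side sends while the other receives.
        transfer = sorted(op or "" for op in ops) == ["recv", "send"]
        row = []
        for n, op in enumerate(ops):
            if op is None:
                row.append("done")
            elif op == "assign" or transfer:
                row.append(op)
                pc[n] += 1
            else:
                row.append("delay")       # channel partner not ready yet
        if all(step in ("delay", "done") for step in row):
            raise RuntimeError("deadlock: no branch can make progress")
        trace.append(row)
    return len(trace), trace


cycles, trace = run_par(["assign", "assign", "send"], ["recv"])
print(cycles)
for row in trace:
    print(row)
```

The receiving branch records two delay cycles before the transfer, matching the three-cycle total described above.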

[2857] This approach extends to all the other Handel-C statements. FIGS. 58A and 58B illustrate a summary 5800 of statement timings, in accordance with one embodiment of the present invention.

[2858] Avoiding Combinatorial Loops

[2859] Consider the following example:

[2860] while (x!=3); // WARNING!!

[2861] If x is modified in a parallel process then this loop should wait for x to become 3 before allowing the program to continue. However, this code is not allowed in Handel-C because it generates a combinatorial loop in the logic because of the way that Handel-C expressions are built to evaluate in zero clock cycles. This is easier to see if one writes it as:

[2862] while (x!=3)

[2863] {

[2864] // wait until x == 3

[2865] }

[2866] This loop may be broken by changing the code to:

[2867] while (x!=3)

[2868] {

[2869] delay;

}

[2870] This loop takes no longer to execute than the other but does not contain a combinatorial loop because of the clock cycle delay in the loop body. The Handel-C compiler may spot this form of error, insert the delay statement, and generate a warning. It is considered better practice to include the delay statement in the code to make it explicit. Beware of code which may look correct but has the same error. For example:

[2871] while (x!=3)

[2872] {

[2873] if (y>z)

[2874] {

[2875] a++;

[2876] }

[2877] }

[2878] As seen above, this if statement may take zero clock cycles to execute if y is not greater than z so even though this loop body does not look empty a combinatorial loop is still generated. Again, this is more obvious written as:

[2879] while (x!=3)

[2880] {

[2881] if (y>z)

[2882] {

[2883] a++;

[2884] }

[2885] else

[2886] {

[2887] // do nothing

[2888] }

[2889] }

[2890] The solution in this example is to add the else part of the if construct as follows:

[2891] while (x!=3)

[2892] {

[2893] if (y>z)

[2894] {

[2895] a++;

[2896] }

[2897] else

[2898] {

[2899] delay;

[2900] }

[2901] }

[2902] Similar problems occur with do . . . while loops and switch statements in similar circumstances. In addition, for loops with no iteration step can cause combinatorial loops.
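The check being applied in these examples is whether a loop body has any path that consumes zero clock cycles. A minimal Python sketch of that check follows; the tuple encoding of statements and both function names are assumptions made for illustration and do not describe how the Handel-C compiler itself works.

```python
def min_cycles(stmts):
    """Fewest clock cycles a statement list can take under the basic rule."""
    total = 0
    for stmt in stmts:
        kind = stmt[0]
        if kind in ("assign", "delay"):
            total += 1
        elif kind == "if":
            _, then_part, else_part = stmt
            total += min(min_cycles(then_part), min_cycles(else_part))
        # A while loop may run zero times, so it adds nothing to the minimum.
    return total


def has_combinatorial_loop(stmts):
    """True if some while loop has a body that can finish in zero cycles."""
    for stmt in stmts:
        kind = stmt[0]
        if kind == "while":
            body = stmt[1]
            if min_cycles(body) == 0 or has_combinatorial_loop(body):
                return True
        elif kind == "if":
            if has_combinatorial_loop(stmt[1]) or has_combinatorial_loop(stmt[2]):
                return True
    return False


# while (x != 3) { if (y > z) { a++; } }  with and without the else delay
hidden = [("while", [("if", [("assign",)], [])])]
repaired = [("while", [("if", [("assign",)], [("delay",)])])]
print(has_combinatorial_loop(hidden), has_combinatorial_loop(repaired))
```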

[2903] Parallel Access to Variables

[2904] As discussed earlier under Scope and variable sharing, the rules of parallelism state that the same variable may not be accessed from two separate parallel branches. This rule is there to avoid resource conflicts on the variables. However, if care is taken then this rule may be relaxed to state that the same variable may not be assigned to more than once on the same clock cycle but may be read as many times as required. The analysis presented in this section of the present description allows the programmer to determine precisely when an assignment may take place and avoid such conflicts.

[2905] This relaxation allows some useful and powerful programming techniques. For example:

[2906] par

[2907] {

[2908] a = b;

[2909] b = a;

[2910] }

[2911] This code swaps the values of a and b in a single clock cycle. Since exact execution time may be run-time dependent, the Handel-C compiler cannot determine when two assignments are made to the same variable on the same clock cycle. One should therefore check the code to ensure that the relaxed rule of parallelism is still obeyed. Using this technique, a four place queue can be written:

[2912] while(1)

[2913] {

[2914] par

[2915] {

[2916] int x[3];

[2917] x[0] = in;

[2918] x[1] = x[0];

[2919] x[2] = x[1];

[2920] out = x[2];

[2921] }

[2922] }

[2923] Here, the value of out is the value of in delayed by 4 clock cycles. On each clock cycle, values may move one place through the x array. FIG. 59 illustrates various I/O 5900 based on clock cycles, in accordance with one embodiment of the present invention.
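Both the swap and the queue rely on every right-hand side being read before any variable is updated. The Python sketch below models one clock cycle in that way; the clock helper and the dictionary-based state are assumptions of the sketch, not part of Handel-C.

```python
def clock(state, assignments):
    """One clock cycle: evaluate every right-hand side on the old state,
    then commit all results together, as the branches of a par block do."""
    new_values = {target: expr(state) for target, expr in assignments.items()}
    updated = dict(state)
    updated.update(new_values)
    return updated


# par { a = b; b = a; } swaps the two values in one cycle.
swap = {"a": lambda s: s["b"], "b": lambda s: s["a"]}
print(clock({"a": 1, "b": 2}, swap))


def queue_outputs(inputs):
    """Feed one input per cycle through x[0], x[1], x[2] and out."""
    state = {"x0": None, "x1": None, "x2": None, "out": None}
    outs = []
    for value in inputs:
        state = clock(state, {
            "x0": lambda s, v=value: v,
            "x1": lambda s: s["x0"],
            "x2": lambda s: s["x1"],
            "out": lambda s: s["x2"],
        })
        outs.append(state["out"])
    return outs


print(queue_outputs([10, 11, 12, 13, 14, 15]))
```

In this model the first input reaches out at the end of the fourth cycle, and a new value follows on every cycle after that.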

[2924] Multiple Simultaneous use of RAMs and ROMs

[2925] Beware of the following code:

[2926] x = y>z ? RamA[1]: RamA[2];

[2927] This code does not execute correctly because of the multiple use of the RAM in the expression. The solution is to re-write the code as follows:

[2928] x = RamA[y>z ? 1 : 2];

[2929] Here, there is only a single access to the RAM so the problem does not occur.

[2930] Detailed Timing Example

[2931] Here is an analyzed example that generates signals tied to real-world constraints. The example used is the generation of signals for a real time clock. The signals required are for microseconds, seconds, minutes and hours. The hardware generated may eventually be driven from an external clock. In order to write the program, the rate of this clock needs to be known, so a value of 5 MHz is assumed. The Handel-C program is shown below.

[2932] The loop body takes one clock cycle to execute. The Count variable is used to divide the clock by 5 to generate microsecond increments. As each variable wraps round to zero, the next time step up is incremented.

void main(void)
{
    unsigned 20 MicroSeconds;
    unsigned 6 Seconds;
    unsigned 6 Minutes;
    unsigned 16 Hours;
    unsigned 3 Count;
    par
    {
        Count = 0;
        MicroSeconds = 0;
        Seconds = 0;
        Minutes = 0;
        Hours = 0;
    }
    while (1)
    {
        if (Count!=4)
            Count++;
        else
            par
            {
                Count = 0;
                if (MicroSeconds!=999999)
                    MicroSeconds++;
                else
                    par
                    {

[2933] Time Efficiency of Handel-C Hardware

[2934] Handel-C requires that the clock period for a program is longer than the longest path through combinatorial logic in the whole program. This means that, for example, once FPGA place and route has been completed, the maximum clock rate for the system can be calculated from the reciprocal of the longest path delay in the circuit. For example, suppose the FPGA place and route tools calculate that the longest path delay between flip-flops in a design is 70 ns. The maximum clock rate that that circuit should be run at is then 1/(70 ns) = 14.3 MHz. But what if this calculated rate is not fast enough for the system performance or real time constraints? This section deals with optimizations that can be made to the program to reduce the longest path delay and increase the maximum possible clock rate.
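The reciprocal relationship is simple enough to check by hand, but a short Python helper makes the unit conversion explicit. The function name is hypothetical; the only input is the longest path delay reported by the place and route tools.

```python
def max_clock_mhz(longest_path_ns):
    """Maximum clock rate in MHz for a given longest path delay in ns.

    1 / (t ns) = (1000 / t) MHz, since 1 ns corresponds to 1000 MHz.
    """
    return 1000.0 / longest_path_ns


print(round(max_clock_mhz(70), 1))   # 14.3
print(round(max_clock_mhz(35), 1))   # halving the delay doubles the rate: 28.6
```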

[2935] Reducing Logic Depth

[2936] When designing Handel-C programs, it is important to remember which operations combine to produce deep logic. Deep logic results in long path delays in the final circuit so reducing logic depth should help to increase clock speed. Some guidelines will now be given for reducing logic depth.

[2937] 1. Division and modulus operators produce the deepest logic. Multiplication also produces deep logic. A single cycle divide, mod or multiplier produces a large amount of hardware and long delays through deep logic so one should avoid using them wherever possible.

[2938] 2. Most common divisions and multiplications can be done with the shift operators. Also consider using a long multiplication with a loop, shift and add routine or a pipelined multiplier.

[2939] 3. Most common modulus operations can be done with the AND operator.

[2940] 4. Wide adders require deep logic for the carry ripple. Consider using more clock cycles with shorter adders. For example, to reduce a single 8-bit wide adder to 3 narrower adders:

[2941] unsigned 8 x;

[2942] unsigned 8 y;

[2943] unsigned 5 temp1;

[2944] unsigned 4 temp2;

[2945] par

[2946] {

[2947] temp1 = (0@(x<-4)) + (0@(y<-4));

[2948] temp2 = (x \\ 4) + (y \\ 4);

[2949] }

[2950] x = (temp2+(0@temp1[4])) @ temp1[3:0];

[2951] 5. Avoid greater than and less than comparisons, as they produce deep logic. For example:

[2952] while (x<y)

[2953] {

[2954] . . .

[2955] x++;

[2956] }

[2957] can be replaced with:

[2958] while (x != y)

[2959] {

[2960] . . .

[2961] x++;

[2962] }

[2963] The == and != comparisons produce much shallower logic although in some cases it is possible to remove the comparison altogether. Consider the following code:

[2964] unsigned 8 x;

[2965] x = 0;

[2966] do

[2967] {

[2968] . . .

[2969] x = x + 1;

[2970] } while (x != 0);

[2971] This code iterates the loop body 256 times but it can be re-written as follows:

[2972] unsigned 9 x;

[2973] x = 0;

[2974] do

[2975] {

[2976] . . .

[2977] x = x + 1;

[2978] } while (!x[8]);

[2979] By widening x by a single bit and just checking the top bit, one may remove an 8-bit comparison.

[2980] 6. Reduce complex expressions into a number of stages. For example:

[2981] x = a + b + c + d + e + f + g + h;

[2982] reduces to:

[2983] par

[2984] {

[2985] temp1 = a + b;

[2986] temp2 = c + d;

[2987] temp3 = e + f;

[2988] temp4 = g + h;

[2989] }

[2990] par

[2991] {

[2992] temp1 = temp1 + temp2;

[2993] temp3 = temp3 + temp4;

[2994] }

[2995] x = temp1 + temp3;

[2996] This code takes three clock cycles as opposed to one but each clock cycle is much shorter and so the rest of the circuit should be sped up by the faster clock rate permitted.

[2997] 7. Avoid long strings of empty statements. Empty statements result from, for example, missing else conditions from if statements. For example:

[2998] if (a>b)

[2999] x++;

[3000] if (b>c)

[3001] x++;

[3002] if (c>d)

[3003] x++;

[3004] if (d>e)

[3005] x++;

[3006] if (e>f)

[3007] x++;

[3008] If none of these conditions is met then all the comparisons are made in one clock cycle. By filling in the else statements with delays, the long path through all these if statements can be split at the expense of having each if statement take one clock cycle whether the condition is true or not.
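Two of the rewrites in the guidelines above, the split adder of guideline 4 and the widened loop counter of guideline 5, are easy to get subtly wrong, so an exhaustive check is worthwhile. The following Python sketch mirrors the arithmetic only; the function names are hypothetical and bit widths are imitated with masks.

```python
def split_add(x, y):
    """8-bit add done as two 4-bit adds plus a carry fix-up (guideline 4)."""
    temp1 = (x & 0xF) + (y & 0xF)             # 5 bits: low nibbles plus carry out
    temp2 = ((x >> 4) + (y >> 4)) & 0xF       # 4 bits: high nibbles, carry dropped
    high = (temp2 + (temp1 >> 4)) & 0xF       # add the carry from the low half
    return (high << 4) | (temp1 & 0xF)        # high @ temp1[3:0]


def widened_loop_iterations():
    """Guideline 5: 9-bit counter, loop ends once bit 8 becomes set."""
    x, count = 0, 0
    while True:
        count += 1                # one pass through the loop body
        x = (x + 1) & 0x1FF       # x = x + 1 on a 9-bit variable
        if (x >> 8) & 1:          # while (!x[8]) fails, so the loop exits
            break
    return count


print(split_add(200, 100))            # (200 + 100) mod 256 = 44
print(widened_loop_iterations())      # 256
```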

[3009] Pipelining

[3010] A classic way to increase clock rates in hardware is to pipeline. A pipelined circuit takes more than one clock cycle to calculate any result but can produce one result every clock cycle. The trade off is an increased latency for a higher throughput so pipelining is only effective if there is a large quantity of data to be processed; it is not practical for single calculations. An example of a pipelined multiplier is given below.

unsigned 8 sum[8];
unsigned 8 a[8];
unsigned 8 b[8];
chanin inputa;
chanin inputb;
chanout output;
par
{
    while(1) inputa ? a[0];
    while(1) inputb ? b[0];
    while(1) output ! sum[7];
    while(1)
    {
        par
        {
            macro proc level(x)
                par
                {
                    sum[x] = sum[x - 1] + ((a[x][0] == 0) ? 0 : b[x]);
                    a[x] = a[x - 1] >> 1;
                    b[x] = b[x - 1] << 1;
                }
            sum[0] = ((a[0][0] == 0) ? 0 : b[0]);
            level(1);
            level(2);
            level(3);
            level(4);
            level(5);
            level(6);
            level(7);
        }
    }
}

[3011] This multiplier calculates the 8 LSBs of the result of an 8-bit by 8-bit multiply using long multiplication. The multiplier produces one result per clock cycle with a latency of 8 clock cycles. This means that although any one result takes 8 clock cycles, one gets a throughput of 1 multiply per clock cycle. Since each pipeline stage is very simple, combinatorial logic is shallow and a much higher clock rate is achieved than would be possible with a complete single cycle multiplier. At each clock cycle, partial results pass through each stage of the multiplier in the sum array. Stage n adds on 2^n multiplied by the b operand if required. The LSB of the a operand at each stage tells the multiply stage whether to add this value or not. Stages are generated with a macro procedure to avoid tedious repetition of code.
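The latency and throughput of the pipeline can be confirmed with a register-level Python model. This is a sketch under stated assumptions: the channels are replaced by direct arguments, every register starts at zero, and the names step and run are hypothetical.

```python
def step(state, in_a, in_b):
    """Advance the pipeline one clock; all reads see the previous state."""
    s, a, b = state
    new_s, new_a, new_b = s[:], a[:], b[:]
    new_a[0], new_b[0] = in_a & 0xFF, in_b & 0xFF      # inputa ? a[0]; inputb ? b[0];
    new_s[0] = b[0] if a[0] & 1 else 0                 # sum[0]
    for x in range(1, 8):                              # level(1) .. level(7)
        new_s[x] = (s[x - 1] + (b[x] if a[x] & 1 else 0)) & 0xFF
        new_a[x] = a[x - 1] >> 1
        new_b[x] = (b[x - 1] << 1) & 0xFF
    return new_s, new_a, new_b


def run(pairs):
    """Feed one operand pair per clock, then flush; record sum[7] each clock."""
    state = ([0] * 8, [0] * 8, [0] * 8)
    results = []
    for in_a, in_b in list(pairs) + [(0, 0)] * 8:
        state = step(state, in_a, in_b)
        results.append(state[0][7])
    return results


pairs = [(3, 5), (200, 7), (255, 255), (17, 15)]
out = run(pairs)
print(out[8:12])
print([(p * q) & 0xFF for p, q in pairs])
```

In this model an operand pair loaded into a[0] and b[0] on one clock produces its product in sum[7] eight clocks later, and consecutive products emerge on consecutive clocks.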

[3012] Operands are fed in on every clock cycle through a[0] and b[0]. Results appear 8 clock cycles later on every clock cycle through sum[7].

11 Targeting hardware

[3013] Introduction

[3014] The previous sections have covered most aspects of writing Handel-C programs. What has not yet been discussed is how to get data into and out of those programs. A major advantage of the custom hardware that can be produced with Handel-C is its ability to interface directly with external components such as RAM, custom and non-custom buses. This section of the present description deals with getting data into and out from the Handel-C program. It begins with a discussion of using the simulator provided with the Handel-C compiler to ensure that the program is correct and then describes interfacing with real hardware devices connected to the pins of the chip containing the hardware. The simulator executes Handel-C programs on the compiling machine without any additional hardware. It allows output to the screen or a file and input from the keyboard or a file. It is a powerful tool that allows programs to be tested thoroughly before custom hardware is manufactured. While no specific hardware platform is detailed here, a number of examples are given of interfacing to theoretical hardware.

[3015] Interfacing with the Simulator

[3016] This section of the present description covers how the program communicates with the simulator. This enables one to debug with real data. Code examples are set forth herein. Communication with the simulator takes place over channels. They are declared using the keywords chanin and chanout. It is assumed that simulation channels never block and may always complete a transfer in one clock cycle.

[3017] Simulator Channels are Declared Using chanin and chanout Instead of chan

[3018] Transfers Over Channels

[3019] The special channels chanin and chanout may be defined for inputting information from the simulator and outputting information back to the simulator. These channels are normally connected to files, although an unconnected channel that outputs data to the simulator may be displayed in the debug window. For example:

[3020] chanin unsigned input with {infile = "./Data/source.dat"};

[3021] chanout unsigned output;

[3022] input ? x;

[3023] output ! y;

[3024] This example declares two channels: one for input from the simulator and one for output to the simulator. The input channel connects to a file managed by the simulator; the output channel connects to the simulator's standard output (the debug window in the Handel-C GUI). Standard channel communication statements can then be used to transfer data. One can declare multiple channels for input and output. For example:

[3025] chanin int 8 input_1 with

[3026] {infile = "./Data/source_1.dat"};

[3027] chanin int 16 input_2 with

[3028] {infile = "./Data/source_2.dat"};

[3029] chanout unsigned 3 output_1;

[3030] chanout char output_2;

[3031] input_1 ? a;

[3032] input_2 ? b;

[3033] output_1 ! (unsigned 3)(((0@a)+b)<-3);

[3034] output_2 ! b;

[3035] When simulated, such a program displays the name of each channel before outputting its value on the simulating computer screen.

[3036] Simulator Input File Format

[3037] The data input file should have one number per line separated by newline characters (either DOS or Unix format text files may be used). Each number may be in any format normally used for constants by Handel-C. For example:

[3038] 56

[3039] 0x34

[3040] 0654

[3041] 0b001001

[3042] Block Data Transfers

[3043] The Handel-C simulator has the ability to read data from a file and write results to another file. For example:

[3044] chanin int 16 input with {infile="in.dat"};

[3045] chanout int 16 output with {outfile="out.dat"};

[3046] void main (void)

[3047] {

[3048] while (1)

[3049] {

[3050] int value;

[3051] input ? value;

[3052] output ! value+1;

[3053] }

[3054] }

[3055] This program reads data from the file in.dat and writes its results to the file out.dat. If the in.dat file consists of:

[3056] 56

[3057] 0x34

[3058] 0654

[3059] 0b001001

[3060] then the out.dat file contains the decimal results as follows:

[3061] 57

[3062] 53

[3063] 429

[3064] 10

[3065] This feature allows algorithms to be debugged and tested without needing to build actual hardware. For example, an image processing application may store a source image in a file and place its results in a second file. All that need be done outside the Handel-C compiler is a conversion from the image (e.g. JPEG file) into the text file taken by the simulator and a conversion back from the output file to an image format. The results can then be viewed and the correct operation of the Handel-C program confirmed.
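When preparing or checking such data files outside the simulator, it helps to be able to read the same constant formats. The Python sketch below is a hypothetical helper, not part of the Handel-C tools; it assumes the four formats listed earlier (decimal, 0x hexadecimal, leading-zero octal and 0b binary) and mirrors the value+1 program above.

```python
def parse_constant(text):
    """Parse decimal, 0x hex, 0b binary and leading-0 octal constants."""
    text = text.strip()
    lower = text.lower()
    if lower.startswith("0x"):
        return int(lower[2:], 16)
    if lower.startswith("0b"):
        return int(lower[2:], 2)
    if len(text) > 1 and text.startswith("0"):
        return int(text[1:], 8)
    return int(text)


def simulate(lines):
    """Mimic the example program: read each value and output value + 1."""
    return [parse_constant(line) + 1 for line in lines if line.strip()]


print(simulate(["56", "0x34", "0654", "0b001001"]))   # [57, 53, 429, 10]
```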

[3066] Targeting FPGA Devices

[3067] The Handel-C language is designed to target real hardware devices. One may supply some important pieces of information to the compiler to allow it to do this. These are the FPGA family and part that the design may be implemented in, and the location of a clock source. The FPGA part details are supplied through the Project>Settings dialog in the GUI (Graphical User Interface). They can also be supplied to the command line compiler using the set statement. Ultimately, this information is passed to the FPGA place and route tool to inform it of the device it should target. The clock source is specified using the ‘set’ command.

[3068] Locating the Clock

[3069] Since each Handel-C main( ) code block generates synchronous hardware, a single clock source is required for each one. The clock is normally provided on one of the external pins of the FPGA but may also be generated internally on Xilinx 4000 devices. The general syntax of the clock specification is: set clock = Location;

[3070] FIG. 60 illustrates a table 6000 showing the various locations, in accordance with one embodiment of the present invention.

[3071] Examples of clocks taken from external device pins are:

[3072] set clock = external “P35”;

[3073] set clock = external_divide “P35” 3;

[3074] set clock = external_divide 3;

[3075] The first of these examples specifies a clock taken from pin P35. The second specifies a clock taken from pin P35 which is divided on the FPGA by a factor of 3. The third shows a clock divided by 3 with no pin number specified. When the pin number is omitted, the place and route tools may choose an appropriate pin. Omitting pin specifications can speed up the design. Examples of clocks taken from the Xilinx 4000 series internal clock generator are:

[3076] set clock = internal “F8M”;

[3077] set clock = internal_divide “F8M” 3;

[3078] Currently, the frequency of the internal clock may take one of the following values:

Specification String    Frequency
“F15”                   15 Hz
“F490”                  490 Hz
“F16K”                  16 kHz
“F500K”                 500 kHz
“F8M”                   8 MHz

[3079] Note that the tolerance for these values is -50% to +25%, so one should not rely on the internal clock being at exactly these frequencies. Internal clocks are only supported on Xilinx 4000 series parts. The clock division specified with the internal_divide and external_divide keywords may be a constant integer.
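To make the effect of the tolerance concrete, the following Python sketch computes the lowest, nominal and highest frequency that a given internal clock specification could deliver. It is an illustration only; the dictionary and the function name `internal_clock_range` are hypothetical, and the sketch assumes that the -50% to +25% tolerance scales proportionally when the clock is divided.

```python
# Nominal internal clock frequencies in Hz, keyed by specification string
INTERNAL_CLOCK_HZ = {
    "F15": 15,
    "F490": 490,
    "F16K": 16_000,
    "F500K": 500_000,
    "F8M": 8_000_000,
}


def internal_clock_range(spec: str, divide: int = 1):
    """Return (minimum, nominal, maximum) frequency in Hz."""
    nominal = INTERNAL_CLOCK_HZ[spec] / divide
    return nominal * 0.5, nominal, nominal * 1.25   # -50% .. +25%


low, nominal, high = internal_clock_range("F8M")
```

For “F8M” with no division this gives a range of 4 MHz to 10 MHz around the nominal 8 MHz, which is why a design should not depend on the exact internal frequency.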

[3080] Targeting Specific Devices via the Command Line

[3081] If one is not using the GUI to specify the target device, he or she may insert lines in the code to specify it. In order to target a specific FPGA, the compiler may be supplied with the FPGA part number. Ultimately, this information is passed to the FPGA place and route tool to inform it of the device it should target. Targeting devices is in two parts: specifying the target family and the target device. The general syntax is:

[3082] set family = Family;

[3083] set part = Chip Number;

[3084] FIG. 61 illustrates the various family names 6100, in accordance with one embodiment of the present invention.

[3085] The part string is the complete Xilinx or Altera device string.For example:

[3086] set family = Xilinx4000E;

[3087] set part = “4010EPC84-1”;

[3088] This instructs the compiler to target a XC4010E device in a PLCC84 package. It also specifies that the device is a -1 speed grade. This last piece of information is required for the timing analysis of the design by the Xilinx tools.

[3089] The family is used to inform the compiler of which special blocks it may generate.

[3090] To target Altera Flex 10K devices:

[3091] set family = Altera10K;

[3092] set part = “EPF10K20RC240-3”;

[3093] This instructs the compiler to target an Altera Flex 10K20 device in a RC240 package. It also specifies that the device is a -3 speed grade. This last piece of information is required for the timing analysis of the design by the Altera Max Plus II or Quartus tools. Note that when performing place and route on the resulting design, the device and package may also be selected via the menus in the Max Plus II software.

[3094] Use of RAMs and ROMs with Handel-C

[3095] Handel-C provides support for interfacing to on-chip and off-chip RAMs and ROMs using the ram and rom keywords. One can specify RAMs and ROMs external to the Handel-C code by using the ports specification keyword. One can control the timing for read/write cycles by using specification keywords that define the relationship between the RAM strobe and the Handel-C clock.

[3096] The usual technique for specifying timing in synchronous and asynchronous RAM is to have a fast external clock which is divided down to provide the Handel-C clock and used directly to provide the pulses to the RAM.

[3097] Asynchronous RAMs

[3098] There are three techniques for timing asynchronous RAMs, depending on the clock available:

[3099] 1. Fast external clock. Use the Handel-C westart and welength specifications to position the write strobe.

[3100] 2. External clock at the same speed as the Handel-C clock. Use multiple reads to give the RAM enough time to respond.

[3101] 3. Undivided external clock. Use the wegate specification to position the write enable signal within the Handel-C clock cycle.

[3102] Fast External Clock

[3103] If the external clock is faster than the internal clock (i.e. the location of the clock is internal_divide or external_divide with a division factor greater than 1) then Handel-C can generate a write strobe for the RAM which is positioned within the Handel-C clock cycle. This is done with the westart and welength specifications. For example:

[3104] set clock = external_divide “P78” 4;

[3105] ram unsigned 6 x[34] with { westart = 2,

[3106] welength = 1};

[3107] The write strobe can be positioned relative to the Handel-C clock cycle by half cycle lengths of the external (undivided) clock. The above example starts the pulse 2 whole external clock cycles into the Handel-C clock cycle and gives it a duration of 1 external clock cycle. Since the external clock is divided by a factor of 4, this is equivalent to a strobe that starts half way through the internal clock cycle and has a duration of one quarter of the internal clock cycle. FIG. 62 illustrates a timing diagram showing a signal 6200, in accordance with one embodiment of the present invention.

[3108] Timing Diagram: Positioned Write Strobe

[3109] This timing allows half a clock cycle for the RAM setup time on the address and data lines and one quarter of a clock cycle for the RAM hold times. This is the recommended way to access asynchronous RAMs.
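The setup, strobe and hold fractions quoted above follow directly from the division factor and the westart and welength values. The Python sketch below reproduces that arithmetic; the function name `write_strobe` is hypothetical, and exact fractions are used so that half-cycle values such as 1.5 are handled without rounding.

```python
from fractions import Fraction


def write_strobe(divide, westart, welength):
    """Return (setup, strobe, hold) as fractions of one Handel-C clock cycle.

    divide   -- division factor of the external clock
    westart  -- strobe start, in external clock cycles
    welength -- strobe length, in external clock cycles
    """
    setup = Fraction(westart) / divide
    strobe = Fraction(welength) / divide
    hold = 1 - setup - strobe
    if hold < 0:
        raise ValueError("write strobe extends past the end of the clock cycle")
    return setup, strobe, hold


# Values from the example: external_divide by 4, westart = 2, welength = 1
setup, strobe, hold = write_strobe(4, 2, 1)
```

For the example this yields a setup time of one half, a strobe of one quarter and a hold time of one quarter of the Handel-C clock cycle.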

[3110] Same Rate External Clock

[3111] This method uses multiple Handel-C RAM accesses to meet the setup and hold times of the RAM.

[3112] ram unsigned 6 x[34];

[3113] Dummy = x[3];

[3114] x[3] = Data;

[3115] Dummy = x[3];

[3116] This code holds the address constant around the RAM write cycle, enabling a write to an asynchronous RAM.

[3117] Undivided External Clock

[3118] The third method of accessing asynchronous RAMs is a compromise between the two previous methods. wegate is used with an undivided external clock and keeps the write strobe in the first or second half of the clock cycle. It is still necessary to hold the address constant either in the clock cycle before or in the clock cycle after the access. For example:

[3119] ram unsigned 6 x[34] with { wegate = 1 };

[3120] x[3] = Data;

[3121] Dummy = x[3];

[3122] This places the write strobe in the second half of the clock cycle (use a value of -1 to put it in the first half) and holds the address for the clock cycle after the write. The RAM therefore has half a clock cycle of setup time and one clock cycle of hold time on its address lines.

[3123] Synchronous RAMs

[3124] Handel-C timing semantics require that any assignment takes one clock cycle. SSRAMs have a latency of at least one clock cycle. To enable the SSRAM timings to fit with the Handel-C timing constraints, Handel-C uses an independent fast clock (RAMCLK). The generation of this clock requires the use of a fast external clock (CLK), divided to provide the Handel-C clock (HCLK).

[3125] The fast clock's pulses can then be placed to clock the SSRAM within a single Handel-C clock tick. The RAMCLK can be carried to an external SSRAM using the clk specification.

[3126] Handel-C supports ZBT-compatible (Zero Bus Turnaround) flow-through and pipelined output devices. DDR (double data rate) and QDR (quad data rate) devices are not supported.

[3127] The Handel-C compiler checks the block specification to find out what type of RAM is being built and generates the appropriate write-enable signal (e.g. active-low for ZBT SSRAM devices and active-high for block RAMs within Xilinx Virtex chips).

[3128] SSRAM Read and Write Cycles

[3129] Most inputs to SSRAMs are captured on the rising edge of the input clock. During a read cycle there is a latency of at least one clock cycle between an address being captured at the input and data becoming available at the output. This is also true for the write cycle in many devices: an address is captured on one clock cycle, and data on the next.

[3130] FIG. 63 illustrates a timing diagram showing a SSRAM read and write 6300, in accordance with one embodiment of the present invention.

[3131] Specifying the Timing

[3132] One can place the RAM clock pulses at different points within the Handel-C clock in the same way that write-enable strobes can be specified for asynchronous RAM devices. The SSRAM clock (RAMCLK) is generated from the fast clock (CLK) according to the Handel-C specifications: rclkpos, wclkpos and clkpulselen. These specifications can be in whole or half cycles of the external clock.

[3133] rclkpos specifies the positions of the clock cycles of RAMCLK for a read cycle. These positions are specified in terms of cycles and half-cycles of CLK, counting forwards from an HCLK rising edge.

[3134] wclkpos specifies the positions of the clock cycles of RAMCLK for a write cycle. These are also counted forward from an HCLK rising edge.

[3135] clkpulselen specifies the length of the RAMCLK pulses in CLK cycles. This is specified once per RAM. It applies to both the read and write clocks.

[3136] Timing Diagram: SSRAM Read Cycle Using Generated RAMCLK

[3137] FIG. 64 illustrates a timing diagram showing a SSRAM read cycle using generated RAMCLK 6400, in accordance with one embodiment of the present invention. The pulse positions and lengths are specified in cycles and half-cycles of CLK. The westart and welength specs are used to place the write enable strobe where it is required.

[3138] Examples

[3139] Flow-Through SSRAM

[3140] This example code generates the hardware shown below. It is also applicable for reading from block RAMs in Xilinx and Altera FPGAs.

[3141] ram unsigned 18 FlowBank[1024] with {block = 1,

[3142] westart = 2,

[3143] welength = 1,

[3144] rclkpos = {1.5},

[3145] wclkpos = {2.5, 3.5},

[3146] clkpulselen = 0.5};

[3147] FIG. 65 illustrates a timing diagram showing a read cycle from a flow-through SSRAM within a Handel-C design 6500, in accordance with one embodiment of the present invention.

[3148] The rising HCLK edge at t0 initiates the read cycle. Some time later, the address A1 is set up, which is sampled somewhere in the middle of the HCLK cycle: t0+1.5 in this case. By the time the next HCLK rising edge occurs at t1, the data is available for reading. The cycle completes within one Handel-C clock cycle.
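The positions of the RAMCLK pulses in the FlowBank declaration can be tabulated from rclkpos, wclkpos and clkpulselen. The Python sketch below is an illustration only; the function names are hypothetical, and the division factor of 4 between CLK and HCLK is an assumption, since the FlowBank example does not state one.

```python
def ramclk_pulses(positions, clkpulselen):
    """Return (rising, falling) edge times of each RAMCLK pulse.

    Times are in CLK cycles, counted forward from the HCLK rising edge.
    """
    return [(pos, pos + clkpulselen) for pos in positions]


def fits_in_hclk_cycle(pulses, divide):
    """True if every pulse lies within one HCLK cycle of `divide` CLK cycles."""
    return all(0 <= rise and fall <= divide for rise, fall in pulses)


# Values from the FlowBank declaration
read_pulses = ramclk_pulses([1.5], 0.5)
write_pulses = ramclk_pulses([2.5, 3.5], 0.5)

# Assumed division factor of 4 (not given in the source example)
all_fit = fits_in_hclk_cycle(read_pulses + write_pulses, 4)
```

Under that assumption the single read pulse rises at t0+1.5, and both write pulses end no later than the next HCLK rising edge.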

[3149] Write Cycle for a Flow-Through SSRAM

[3150] Flow-through SSRAMs perform a “late” write cycle; the data is clocked in one clock cycle after the address is sampled. FIG. 66 illustrates a timing diagram showing a complete write cycle 6600, in accordance with one embodiment of the present invention.

[3151] The HCLK rising edge at t0 initiates the write cycle, causing the ADDRESS and DATAIN signals to change. Two cycles of RAMCLK are needed to clock the new data into the RAM at the specified address: the first to sample the address, the second to sample the data. However, since it is not expected to read from the RAM's output, one can wait until the last possible moment. In this case, the two rising edges of RAMCLK occur at t0+2.5 and t0+3.5.

[3152] The write enable signal may be low during the rising edge of RAMCLK that samples the address, but not during the one that samples the data. This can be done by setting westart and welength as shown. The entire cycle completes within a single Handel-C clock cycle.

[3153] Pipelined-Output SSRAM

[3154] This example code generates the hardware shown below.

[3155] ram unsigned 18 PipeBank[1024] with {block = 1,

[3156] westart = 1.5,

[3157] welength = 1,

[3158] rclkpos = {1.5, 2.5},

[3159] wclkpos = {2, 3, 4},

[3160] clkpulselen = 0.5};

[3161] FIG. 67 illustrates a timing diagram showing a complete read cycle 6700, in accordance with one embodiment of the present invention. This read cycle is very similar to that for a flow-through RAM. The rising HCLK edge at t0 initiates the read cycle. Some time later, the address A1 is set up, which is sampled somewhere near the middle of the HCLK cycle (t0+1.5 in this case). The RAM contents at address A1 are then piped to the RAM's output register; they may then be made available at the RAM output. A second RAMCLK pulse (at t0+2.5 in this case) is used to do this. By the time the next HCLK rising edge occurs at t1, the data is available for reading by the Handel-C design. The cycle completes within one Handel-C clock cycle.

[3162] Write Cycle for a Pipelined-Output SSRAM

[3163] Pipelined-output SSRAMs perform a “late-late” write cycle. This means that data is written to memory two clock cycles after the address is sampled. FIG. 68 illustrates a timing diagram showing a complete cycle 6800, in accordance with one embodiment of the present invention. The HCLK rising edge at t0 initiates the write cycle, causing the ADDRESS and DATAIN signals to change. Three cycles of RAMCLK are needed to clock the new data into the RAM at the specified address: the first to sample the address and the third to sample the data. Since one may not read from the RAM on a write strobe, he or she can sample the data as late as possible to give the circuit maximum time to set up the data. In this case, the three rising edges of RAMCLK occur at t0+2.0, t0+3.0 and t0+4.0. The write enable signal may be low during the rising edge of RAMCLK that samples the address, but not during the one that samples the data. This can be done by setting westart and welength as shown. The entire cycle completes within a single Handel-C clock cycle.
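The relationship between the write enable strobe and the RAMCLK rising edges can be checked numerically for both example declarations. The Python sketch below is an illustration only; the function names are hypothetical, and it assumes the strobe is active on the half-open interval from westart up to, but not including, westart + welength.

```python
def we_active(t, westart, welength):
    """True if the write enable strobe is active at time t (in CLK cycles)."""
    return westart <= t < westart + welength


def we_at_edges(wclkpos, westart, welength):
    """For each RAMCLK rising edge of a write, report whether WE is active."""
    return [we_active(t, westart, welength) for t in wclkpos]


# Flow-through example: wclkpos = {2.5, 3.5}, westart = 2, welength = 1
flow = we_at_edges([2.5, 3.5], westart=2, welength=1)

# Pipelined-output example: wclkpos = {2, 3, 4}, westart = 1.5, welength = 1
pipe = we_at_edges([2, 3, 4], westart=1.5, welength=1)
```

In both cases the strobe is active only at the first edge, the one that samples the address, and inactive at the edge that samples the data, which agrees with the description above.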

[3164] Using On-Chip RAMs in Xilinx Devices

[3165] Xilinx 4000 series devices can implement RAMs and ROMs in the look up tables on the device. Handel-C supports the synchronous RAMs on the 4000E, 4000EX, 4000L, 4000XL, 4000XV and Virtex series parts directly, simply by declaring a RAM or ROM in the way described earlier.

[3166] For example:

[3167] ram unsigned 6 x[34];

[3168] This may declare a RAM with 34 entries, each of which is 6 bits wide.

[3169] Using On-Chip RAMs in Altera Devices

[3170] On-chip RAMs in Altera Flex 10K devices use the EAB structures. These blocks can be configured in a number of data width/address width combinations. When writing Handel-C programs, one should be careful not to exceed the number of EAB blocks in the target device or the design may not place and route successfully. While it is possible to use RAMs that do not match one of the data width/address width combinations, EAB space may be wasted by such a RAM. As with Xilinx devices, the RAM blocks in Flex 10K parts can be configured to be either synchronous or asynchronous.

[3171] Synchronous Access

[3172] By default, Handel-C may use a synchronous access by utilizing the falling edge of the clock as the input clock signal to the RAM. This is the recommended method for using RAMs.

[3173] Asynchronous Access

[3174] If one uses one of the westart, welength or wegate specifications described in the previous section, the Handel-C compiler may generate an asynchronous RAM. This may help with the timing characteristics of the design.

[3175] Initialisation

[3176] RAM/ROM initialisation files with a .mif extension may be generated on compilation to feed into the Max Plus II software. This process is transparent as long as they are in the same directory as the EDIF (.edf extension) file generated by the Handel-C compiler.

[3177] Using External RAMs

[3178] Asynchronous RAMs

[3179] Handel-C provides support for accessing off-chip static RAMs in the same way as one accesses internal RAMs. The syntax for an external RAM declaration is:

ram Type Name[Size] with {
    offchip = 1,
    data = Pins,
    addr = Pins,
    we = Pins,
    oe = Pins,
    cs = Pins};

[3180] For example, to declare a 16 Kbyte by 8-bit RAM:

ram unsigned 8 ExtRAM[16384] with {
    offchip = 1,
    data = {“P1”, “P2”, “P3”, “P4”, “P5”, “P6”, “P7”, “P8”},
    addr = {“P9”, “P10”, “P11”, “P12”, “P13”, “P14”, “P15”, “P16”, “P17”, “P18”, “P19”, “P20”, “P21”, “P22”},
    we = {“P23”},
    oe = {“P24”},
    cs = {“P25”}};

[3181] Note that the lists of address and data pins are in the order of most significant to least significant. It is possible for the compiler to infer the width of the RAM (8 bits in this example) and the number of address lines used (14 in this example) from the RAM's usage. However, this is not recommended since this declaration deals with real external hardware which has a fixed definition. Accessing the RAM is the same as for accessing internal RAM. For example:

[3182] ExtRAM[1234] = 23;

[3183] y = ExtRAM[5678];
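The widths quoted for the ExtRAM declaration (8 data bits and 14 address lines for 16384 entries) can be cross-checked against the pin lists before a design is sent to place and route. The Python sketch below is an illustration only; the function names `address_lines` and `pin_lists_match` are hypothetical and are not part of the Handel-C tools.

```python
def address_lines(entries: int) -> int:
    """Number of address lines needed to index `entries` locations."""
    return max(1, (entries - 1).bit_length())


def pin_lists_match(entries, width, data_pins, addr_pins):
    """True if the pin lists agree with the declared depth and width."""
    return len(data_pins) == width and len(addr_pins) == address_lines(entries)


# Pin lists from the ExtRAM example, most significant bit first
data_pins = [f"P{n}" for n in range(1, 9)]     # P1 .. P8
addr_pins = [f"P{n}" for n in range(9, 23)]    # P9 .. P22

consistent = pin_lists_match(16384, 8, data_pins, addr_pins)
```

For the example, 16384 entries require 14 address lines, which matches the 14 pins P9 to P22.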

[3184] Similar restrictions apply as with internal RAM: only one access may be made to the RAM in any one clock cycle. The compiled hardware generates the following cycle for a write to external RAM. FIG. 69 illustrates a timing diagram showing a cycle for a write to external RAM 6900, in accordance with one embodiment of the present invention. FIG. 70 illustrates a timing diagram showing a cycle for a read from external RAM 7000, in accordance with one embodiment of the present invention.

[3185] This cycle may not be suitable for the RAM device in use. In particular, asynchronous static RAM may not work with the above cycle due to setup and hold timing violations. For this reason, the westart, welength and wegate specifications may also be used with external RAM declarations.

[3186] Fast External Clock Example

[3187] For example, to declare a 16 Kbyte by 8-bit RAM:

set clock = external_divide “P99” 4;
ram unsigned 8 ExtRAM[16384] with {
    offchip = 1,
    westart = 2,
    welength = 1,
    data = {“P1”, “P2”, “P3”, “P4”, “P5”, “P6”, “P7”, “P8”},
    addr = {“P9”, “P10”, “P11”, “P12”, “P13”, “P14”, “P15”, “P16”, “P17”, “P18”, “P19”, “P20”, “P21”, “P22”},
    we = {“P23”},
    oe = {“P24”},
    cs = {“P25”}};

[3188] FIG. 71 illustrates a timing diagram showing a cycle for a write to external RAM 7100, in accordance with one embodiment of the present invention. FIG. 72 illustrates a timing diagram showing a cycle for a read from external RAM 7200, in accordance with one embodiment of the present invention.

[3189] The compiled hardware generates the following cycle for a write to external RAM.

[3190] The compiled hardware generates the following cycle for a read from external RAM:

[3191] Accessing the RAM is the same as for accessing internal RAM. For example:

[3192] ExtRAM[1234] = 23;

[3193] y = ExtRAM[5678];

[3194] Similar restrictions apply as with internal RAM: only one access may be made to the RAM in any one clock cycle.

[3195] Wegate Example

[3196] The wegate specification may be used when a multiplied clock is not available.

[3197] For example, to declare a 16 Kbyte by 8-bit RAM:

ram unsigned 8 ExtRAM[16384] with {
    offchip = 1,
    wegate = 1,
    data = {“P1”, “P2”, “P3”, “P4”, “P5”, “P6”, “P7”, “P8”},
    addr = {“P9”, “P10”, “P11”, “P12”, “P13”, “P14”, “P15”, “P16”, “P17”, “P18”, “P19”, “P20”, “P21”, “P22”},
    we = {“P23”},
    oe = {“P24”},
    cs = {“P25”}};

[3198] FIG. 73 illustrates a timing diagram showing a cycle for a write to external RAM 7300, in accordance with one embodiment of the present invention. FIG. 74 illustrates a timing diagram showing a cycle for a read from external RAM 7400, in accordance with one embodiment of the present invention.

[3199] Accessing the RAM is the same as for accessing internal RAM. For example:

[3200] ExtRAM[3] = Data;

[3201] Dummy = ExtRAM[3];

[3202] Similar restrictions apply as with internal RAM: only one access may be made to the RAM in any one clock cycle. Note that the timing diagram above may violate the hold time for some asynchronous RAM devices. If the delay between rising clock edge and rising write enable is longer than the delay between rising clock edge and the change in data or address then corruption in the write may occur in these devices. The two cycle access does not solve this problem since it is not possible to hold the data lines constant beyond the end of the clock cycle. If this causes a problem then a multiplied external clock may be used as described above.

[3203] Using the wegate specification may violate the hold time for some asynchronous RAM devices.

[3204] Synchronous RAMs

[3205] Off-chip synchronous SRAMs can be specified in exactly the same way as on-chip synchronous SRAMs, with the addition of the clk specification. clk specifies the pin on which the generated RAMCLK may appear, when the SSRAM in question is external (offchip = 1).

[3206] Example

macro expr addressPins = {Pin List. . .};
macro expr dataPins = {Pin List. . .};
macro expr csPins = {Pin List. . .};
macro expr wePins = {Pin List. . .};
macro expr oePins = {Pin List. . .};
macro expr clkPins = {Pin List. . .};

ram unsigned 32 ExtBank[1024] with {
    offchip = 1,
    addr = addressPins,
    data = dataPins,
    cs = csPins,
    we = wePins,
    oe = oePins,
    westart = 2,
    welength = 1,
    rclkpos = {1.5, 2.5},
    wclkpos = {1.5, 2.5, 3.5},
    clkpulselen = 0.5,
    clk = clkPins};

[3207] Using External ROMs

[3208] An external ROM is declared as an external RAM with an empty write enable pin list. For example:

ram unsigned 8 ExtROM[16384] with {
    offchip = 1,
    data = {“P1”, “P2”, “P3”, “P4”, “P5”, “P6”, “P7”, “P8”},
    addr = {“P9”, “P10”, “P11”, “P12”, “P13”, “P14”, “P15”, “P16”, “P17”, “P18”, “P19”, “P20”, “P21”, “P22”},
    we = {},
    oe = {“P24”},
    cs = {“P25”}};

[3209] Note that no westart, welength or wegate specification is required since there is not a write strobe signal on a ROM device.

[3210] Connecting to RAMs in Foreign Code

[3211] One can create ports to connect to a RAM by using the ports = 1 specification to the memory definition. This may generate VHDL or EDIF wires which can be connected to a component created elsewhere. The ports specification cannot be used in conjunction with the offchip = 1 specification, but all other specifications may apply. The interface generated may have separate read (output) and write (data) ports, write enable, data enable and clock wires. This ensures that it can be connected to any device. Pin names provided in the addr, data, cs, we, oe, and clk specifications may be passed through to the generated EDIF. They are not passed through to VHDL, since VHDL interfaces are generated as n-bit wide buses rather than n 1-bit wide wires. This means that it is ambiguous to specify a separate identifier for each wire. If they are used when compiling to VHDL, the compiler issues a warning.

[3212] A clock port may always be generated.

[3213] If one uses the ports specification with an MPRAM, a separate interface may be generated for each port.

EXAMPLES Example 1 Generating an interface to a foreign code RAM.

[3214] // Pin name specifications have been commented out
set family = Xilinx4000XV;
set part = “V1000BG560-4”;
set clock = external “C1”;

macro expr dataPins = {“D1”, “D2”, “D3”, “D4”};
macro expr addrPins = {“A1”, “A2”};
macro expr wePins = {“WE1”};
macro expr csPins = {“CS1”};
macro expr oePins = {“OE1”};

unsigned 4 a;

ram unsigned 4 rax[4] with {ports = 1/*, data = dataPins, addr = addrPins, we = wePins, cs = csPins, oe = oePins*/};

void main(void)
{
    static unsigned 2 i = 0;

    while(1)
    {
        par
        {
            i++;
            a++;
            rax[i] = a;
        }
        a = rax[i];
    }
}

The declaration of rax would produce wires

rax_SPPort_addr<0>        // Address
rax_SPPort_addr<1>
rax_SPPort_data_in<0>     // Data In
rax_SPPort_data_in<1>
rax_SPPort_data_in<2>
rax_SPPort_data_in<3>
rax_SPPort_data_out<0>    // Data Out
rax_SPPort_data_out<1>
rax_SPPort_data_out<2>
rax_SPPort_data_out<3>
rax_SPPort_data_en        // Data Enable
rax_SPPort_clk            // Clock
rax_SPPort_cs             // Chip Select
rax_SPPort_oe             // Output Enable
rax_SPPort_we             // Write Enable
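The wire list for rax follows a regular pattern, which can be reproduced for checking a netlist against a declaration. The Python sketch below is an illustration only: the function name `port_wires` is hypothetical, the naming pattern is copied from the listing for rax, and extending it to other widths and depths is an assumption rather than documented compiler behaviour.

```python
def port_wires(name, data_width, entries, port="SPPort"):
    """List the wire names expected for a RAM declared with ports = 1."""
    addr_width = max(1, (entries - 1).bit_length())
    prefix = f"{name}_{port}"
    wires = [f"{prefix}_addr<{i}>" for i in range(addr_width)]
    wires += [f"{prefix}_data_in<{i}>" for i in range(data_width)]
    wires += [f"{prefix}_data_out<{i}>" for i in range(data_width)]
    # Single-bit control wires: data enable, clock, chip select,
    # output enable and write enable
    wires += [f"{prefix}_{suffix}" for suffix in ("data_en", "clk", "cs", "oe", "we")]
    return wires


# ram unsigned 4 rax[4]: 4 data bits, 4 entries (2 address lines)
wires = port_wires("rax", data_width=4, entries=4)
```

For rax this produces the fifteen wire names shown in the listing, from rax_SPPort_addr<0> through rax_SPPort_we.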

Example 2 Generating an Interface to a Foreign Code MPRAM.

[3215] // Pin name specifications have been commented out
set family = Xilinx4000XV;
set part = “V1000BG560-4”;
set clock = external “C1”;

macro expr dataPins = {“D1”, “D2”, “D3”, “D4”};
macro expr addrPins = {“A1”, “A2”};
macro expr wePins = {“WE1”};
macro expr csPins = {“CS1”};
macro expr oePins = {“OE1”};

unsigned 4 a;

mpram Mpaz
{
    wom unsigned 4 wox[4];
    rom unsigned 4 rox[4];
} mox with {ports = 1/*, data = dataPins, addr = addrPins, we = wePins, cs = csPins, oe = oePins*/};

void main(void)
{
    static unsigned 2 i = 0;

    while(1)
    {
        par
        {
            i++;
            a++;
            mox.wox[i] = a;
        }
        a = mox.rox[i];
    }
}

[3216] Using Other RAMs

[3217] The interface to other types of RAM such as DRAM should be written by hand using interface declarations described in the following sections. Macro procedures can then be written to perform complex or even multi-cycle accesses to the external device.

[3218] Interfacing with External Hardware and Logic

[3219] While the simulator allows debugging of Handel-C programs, the real target of the compiler is hardware. It is therefore essential that the compiler can generate hardware that interfaces with external components. These next sections detail the building blocks of such hardware interfaces. All off-chip accesses are based on the idea of a bus, which is just a collection of external pins. Handel-C provides the ability to read the state of pins for input from the outside world and set the state of pins for writing to the outside world. Tri-state buses are also supported to allow bi-directional data transfers through the same pins. The pins used may be defined in Handel-C by using the data specification. If this is omitted, the pins may be left unconstrained and can be assigned by the place and route tools. Note that Handel-C provides no information about the timing of the change of state of a signal within a Handel-C clock cycle. Timing analysis is available from the FPGA manufacturer's place-and-route tools. Handel-C programs can also interface to external logic (other Handel-C programs, programs written in VHDL etc.) by using user-defined interfaces or Handel-C ports.

[3220] Interfaces

[3221] All interfaces other than RAMs are declared with the interface keyword. The general syntax of interfaces is as follows:

interface Sort(Types) Name(Args) with {Specs};

Here, the Sort field specifies what sort of interface is required, Types describes the types of values associated with objects coming from the interface, Name specifies an identifier for the interface, Args specifies any parameters that the interface may require and Specs gives hardware details of the interface such as chip pin numbers. Further details of the interface syntax were provided earlier.

[3222] FIG. 75 is a table of pre-defined interface sorts 7500, in accordance with one embodiment of the present invention.

[3223] Reading from External Pins

[3224] The bus_in interface sort allows Handel-C programs to read from external pins. Its general usage is:

[3225] interface bus_in(type portName) Name( )

[3226] with {data = {Pin List}};

[3227] A specific example is:

[3228] interface bus_in(int 4 To) InBus( ) with {data =

[3229] {“P1”, “P2”, “P3”, “P4”}};

[3230] This declares a bus connected to pins P1, P2, P3 and P4 where pin P1 is the most significant bit and pin P4 is the least significant bit. Reading the bus is performed by accessing the identifier Name.portName as a variable which may return the value on those pins at that clock edge. For example:

[3231] int 4 x;

[3232] x = InBus.To;

[3233] This sets the variable x to the value on the external pins. The type of InBus.To is int 4 as specified in the type list after the bus_in keyword.

[3234] If no input port name is given, the port name defaults to in.
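Because the pin list runs from most significant to least significant bit, the value read from the bus depends on the order in which pins are listed. The Python sketch below shows how four pin levels would combine into an int 4 value; the function names are hypothetical, and the two's-complement interpretation of the signed type is an assumption stated here for illustration.

```python
def pins_to_unsigned(levels):
    """Combine pin levels (first pin is the most significant bit)."""
    value = 0
    for bit in levels:
        value = (value << 1) | (bit & 1)
    return value


def pins_to_signed(levels):
    """Interpret the same pin levels as a two's-complement integer."""
    value = pins_to_unsigned(levels)
    return value - (1 << len(levels)) if levels[0] else value


# Levels on pins P1, P2, P3, P4 in declaration order
levels = [1, 0, 1, 1]
as_unsigned = pins_to_unsigned(levels)
as_int4 = pins_to_signed(levels)
```

With P1 high, P2 low and P3 and P4 high, the bit pattern is 1011, which is 11 as an unsigned value and -5 as a 4-bit signed value.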

[3235] Registered Reading from External Pins

[3236] The bus_latch_in interface sort is similar to the bus_in interface sort but allows the input to be registered on a condition. This may be required to sample the signal at particular times. Its general usage is:

[3237] interface bus_latch_in(type portName)

[3238] Name(type conditionPortName=Condition)

[3239] with {data = {Pin List}};

[3240] Its usage is exactly like the bus_in interface sort except that Condition specifies a signal that is used to clock the input registers in the FPGA. The rising edge of this signal clocks the external signal into the internal value. For example:

int 1 get;
int 4 x;
interface bus_latch_in(int 4 To) InBus(int 1 condition = get)
    with {data = {“P1”, “P2”, “P3”, “P4”}};
get = 0;
get = 1; // Register the external value
x = InBus.To; // Read the registered value
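The effect of registering the input on the rising edge of the condition signal can be modelled in a few lines. The Python sketch below is a behavioural illustration only; the class name `LatchedInput` is hypothetical, and the initial register value of 0 is an assumption.

```python
class LatchedInput:
    """Behavioural model of an input registered on a rising edge."""

    def __init__(self):
        self.registered = 0   # assumed power-up value
        self._previous = 0    # last seen level of the condition signal

    def step(self, condition, external):
        """Advance one step and return the registered value."""
        if condition and not self._previous:
            self.registered = external   # rising edge: capture the pins
        self._previous = condition
        return self.registered


bus = LatchedInput()
trace = [
    bus.step(0, 9),   # get = 0: nothing captured yet
    bus.step(1, 9),   # get = 1: rising edge registers 9
    bus.step(1, 3),   # condition still high: pins change, register holds
    bus.step(0, 3),   # falling edge: register holds
    bus.step(1, 3),   # next rising edge registers 3
]
```

The trace shows that the registered value changes only on a rising edge of the condition signal, not whenever the external pins change.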

[3241] Clocked Reading from External Pins

[3242] The bus_clock_in interface sort is similar to the bus_in interface sort but allows the input to be clocked continuously from the Handel-C global clock. This may be required to synchronize the signal to the Handel-C clock. Its general usage is:

[3243] interface bus_clock_in(type portName) Name( )

[3244] with {data = {Pin List}};

[3245] Its usage is exactly like the bus_in interface sort. The rising edge of the Handel-C clock clocks the external signal into the internal value. For example:

[3246] interface bus_clock_in(int 4 InTo) InBus( ) with

[3247] {data = {“P1”, “P2”, “P3”, “P4”}};

[3248] x = InBus.InTo; // Read flip-flop value

[3249] Writing to External Pins

[3250] The bus_out interface sort allows Handel-C programs to write to external pins. Its general usage is:

[3251] interface bus_out( ) Name(type portName=Expression)

[3252] with {data = {Pin List}};

[3253] A specific example is:

[3254] interface bus_out( ) OutBus(int 4 OutPort=x+y)

[3255] with {data =

[3256] {“P1”, “P2”, “P3”, “P4”}};

[3257] This declares a bus connected to pins P1, P2, P3 and P4 where pin P1 is the most significant bit and pin P4 is the least significant bit. The value appearing on the external pins is the value of the expression x+y at all times.

[3258] Bi-Directional Data Transfer

[3259] The bus_ts interface sort allows Handel-C programs to perform bi-directional off-chip communications via external pins. Its general usage is:

[3260] interface bus_ts (type inPortName)

[3261] Name(type outPortName=Value,

[3262] type conditionPortName = Condition)

[3263] with {data = {Pin List}};

[3264] Here, Value and Condition are two expressions. Value refers to the value to output on the pins and Condition refers to the condition for driving the pins. When the second expression is non-zero (i.e. true), the value of the first expression is driven on the pins. When the value of the second expression is zero, the pins are tri-stated and the value of the external bus can be read using the identifier Name.inPortName in much the same way that bus_in interfaces are read. If inPortName is not defined, the port name defaults to in.

[3265] A specific example is:

[3266] int 1 condition;

[3267] int 4 x;

[3268] interface bus_ts(int 4 read)

[3269] BiBus(int write=x+1,

[3270] int 1 enable= condition)

[3271] with {data = {“P1”, “P2”, “P3”, “P4”}};

[3272] condition = 0; // Tri-state the pins

[3273] x = BiBus.read; // Read the value

[3274] condition = 1; // Drive x+1 onto the pins

[3275] This example reads the value of the external bus into variable x and then drives the value of x + 1 onto the external pins. Take care when driving tri-state buses that the FPGA and another device on the bus cannot drive simultaneously as this may result in damage to one or both of them.
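The drive and tri-state rules, and the contention hazard in the warning above, can be expressed as a small resolution function. The Python sketch below is a behavioural illustration only; the function name `resolve_bus` is hypothetical, and modelling a floating bus as `None` is an assumption made for the sketch.

```python
def resolve_bus(fpga_enable, fpga_value, other_enable, other_value):
    """Return the value seen on a shared tri-state bus.

    Raises RuntimeError if both devices drive at once (bus contention).
    Returns None if nobody drives (the bus is floating).
    """
    if fpga_enable and other_enable:
        raise RuntimeError("bus contention: both devices are driving the pins")
    if fpga_enable:
        return fpga_value
    if other_enable:
        return other_value
    return None


# condition = 0: the FPGA tri-states its pins and reads the external value
x = resolve_bus(0, 0, 1, 6)

# condition = 1: the FPGA drives x + 1 while the other device releases the bus
driven = resolve_bus(1, x + 1, 0, 0)
```

In this run the FPGA first reads 6 from the other device and then drives 7 onto the pins; enabling both drivers in the same step raises the contention error rather than returning a value.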

[3276] Bi-Directional Data Transfer with Registered Input

[3277] The bus_ts_latch_in interface sort allows Handel-C programs to perform bi-directional off-chip communications via external pins with inputs registered on a condition. Its general usage is:

[3278] interface bus_ts_latch_in (type inPortName)

[3279] Name(type outPortName=Value,

[3280] type conditionPortName = Condition,

[3281] type clockPortName = Clock)

[3282] with {data = {Pin List}};

[3283] Here, Value, Condition and Clock are all expressions. Value refers to the value to output on the pins, Condition refers to the condition for driving the pins and Clock refers to the signal to clock the input from the pins. When the second expression is non-zero, the value of the first expression is driven on the pins. When the value of the second expression is zero, the pins are tri-stated and the registered value of the external bus can be read using the identifier Name.inPortName in much the same way that bus_in interfaces are read. If inPortName is not defined, the port name defaults to in. The rising edge of the value of the third expression clocks the external values through to the internal values on the chip. For example:

[3284] int 1 get;

[3285] int 1 condition;

[3286] int 4 x;

[3287] interface bus_ts_latch_in(int 4 read)

[3288] BiBus(int write=x+1,

[3289] int 1 enable = condition,

[3290] int 1 clock = get)

[3291] with {data = {“P1”, “P2”, “P3”, “P4”}};

[3292] condition = 0; // Tri-state external pins

[3293] get = 0;

[3294] get = 1; // Register external value

[3295] x = BiBus.read; // Read registered value

[3296] condition = 1; // Drive x+1 onto external pins

[3297] This example samples the external bus and reads the registered value into variable x and then drives the value of x + 1 onto the external pins. Take care when driving tri-state buses that the FPGA and another device on the bus cannot drive simultaneously as this may result in damage to one or both of them.

[3298] Bi-Directional Data Transfer with Clocked Input

[3299] The bus_ts_clock_in interface sort allows Handel-C programs to perform bi-directional off-chip communications via external pins with inputs clocked continuously with the Handel-C clock. Its general usage is:

[3300] interface bus_ts_clock_in (type inPortName)

[3301] Name(type outPortName=Value,

[3302] type conditionPortName = Condition)

[3303] with {data = {Pin List}};

[3304] Here, Value and Condition are expressions. Value refers to the value to output on the pins and Condition refers to the condition for driving the pins. When the Condition is non-zero (i.e. true), the value of Value is driven on the pins. When the value of Condition is zero, the pins are tri-stated and the value of the external bus can be read using the identifier Name.inPortName in much the same way that bus_in interfaces are read.

[3305] The rising edge of the Handel-C clock reads the external values into the internal flip-flops on the chip. For example:

[3306] int 1 condition;

[3307] int 4 x;

[3308] interface bus_ts_clock_in (int 4 read)

[3309] BiBus(int 4 writePort=x+1,

[3310] int 1 enable=condition)

[3311] with {data = {“P1”, “P2”, “P3”, “P4”}};

[3312] condition = 0; // Tri-state external pins

[3313] x = BiBus.read; // Read registered value

[3314] condition = 1; // Drive x+1 onto external pins

[3315] This example reads the value from the flip-flop into variable x and then drives the value of x + 1 onto the external pins. Take care when driving tri-state buses that the FPGA and another device on the bus cannot drive simultaneously as this may result in damage to one or both of them.

[3316] Merging Pins

[3317] It is possible to merge pins:

[3318] merge input pins with double declarations of input bus interfaces

[3319] merge tri-state pins

[3320] Input pins can be merged so that pins can be read simultaneously into multiple variables. This can be done by specifying multiple interfaces (bus_in, bus_clock_in, bus_latch_in) which have some pins in common. If required, a different subset of pins can be specified for each instance of the interface. For example:

[3321] interface bus_in(int 8 wide) wideDataBus( ) with

[3322] {data = {“P1”, “P2”, “P3”, “P4”, “P5”,

[3323] “P6”, “P7”, “P8”}};

[3324] interface bus_in(int 3 thin) thinDataBus( ) with

[3325] {data = {“P3”, “P4”, “P5”}};

[3326] wideDataBus.wide would give the values of pins 1-8, whereas thinDataBus.thin would give the three bit value on pins 3, 4 and 5. Tri-state bus pins can be merged, though doing so may generate a compiler warning, as the compiler cannot detect whether there is a conflict in the use of the merged pins. One might wish to merge output pins for a tri-state bus in order to switch the circuit connections from one external piece of logic to another. For example:

[3327] int 1 en1, en2;

[3328] int 4 x, y;

[3329] interface bus_ts_clock_in (int 4 read)

[3330] BiBus1( int 4 writePort=x+1, en1== 1)

[3331] with {data = {“P1”, “P2”, “P3”, “P4”}};

[3332] interface bus_ts_clock_in (int 4 read)

[3333] BiBus2(int 4 writePort=y+1, en2==1)

[3334] with {data = {“P1”, “P2”, “P3”, “P4”}};

[3335] Take care when driving tri-state buses that the FPGA and another device on the bus cannot drive simultaneously as this may result in damage to one or both of them.

[3336] Buses and the Simulator

[3337] The Handel-C simulator cannot simulate buses directly. The recommended process for debugging is to use the channel method outlined earlier in this section of the present description. This is because the simulation of buses cannot determine when input and output should occur.

[3338] By using the #define and #ifdef . . . #endif constructs of the preprocessor, it is possible to combine both the simulation and hardware versions of the program into one. For example:

[3339] #define SIMULATE

[3340] #ifdef SIMULATE

[3341] input ? value;

[3342] #else

[3343] value = BusIn.in;

[3344] #endif

[3345] Refer to the Handel-C Preprocessor section for details of conditional compilation. Simulation of buses may be important when debugging the interface with the outside world. In this case, one can use the Application Programmers Interface (API) to write a plugin which can be co-simulated. For example, to simulate a tri-state bus:

[3346] #ifdef SIMULATE

[3347] interface bus_ts (uint 32 in with

[3348] {extlib=“cosim_hc.dll”, extinst=“1”, extfunc=“DataBusIn”})

[3349] DataBus (DataOut with {extlib=“cosim_hc.dll”,

[3350] extinst=“1”, extfunc=“DataBusOut”},

[3351] !WriteBus.in with {extlib=“cosim_hc.dll”,

[3352] extinst=“1”, extfunc=“DataBusEnable”});

[3353] #else

[3354] interface bus_ts (uint 32 in with {data = pinList})

[3355] DataBus (DataOut, !WriteBus.in)

[3356] with {data = pinList};

[3357] #endif

[3358] In this case, the functions DataBusIn, DataBusOut and DataBusEnable would be provided in the plugin cosim_hc.dll and called by the simulator. Details of using the API to write plugins are given herein.

[3359] Timing Considerations for Buses

[3360] It is sometimes important to be aware of the timing of the external interfaces. While Handel-C without hardware libraries does not allow one to control exact timings, some care when writing code can allow enough control to make such interfaces work. The first consideration is for bus_in interfaces. This form of bus is built with no register between the external pin and the points inside the FPGA where the data is used. Thus, if the value on the external pin changes asynchronously with the Handel-C clock then routing delays within the FPGA can cause the value to be read differently in different parts of the circuit. For example: interface bus_in(int 1 read) a( ) with {data = {“P1”}}; par { x = a.read; y = a.read; }

[3361] Even though a.read is assigned to both x and y on the same clock cycle, if the delay from pin 1 to the flip-flop implementing the x variable is significantly different from that between pin 1 and the flip-flop implementing the y variable then x and y may end up with different values. This can be seen by considering the timing of some signals.

[3362] Here, the delay between pin 1 and the input of y is slightly longer than the delay between pin 1 and the input to x. As a result, when the rising edge of the clock registers the values of x and y, there is one clock cycle when x and y have different values. FIG. 76 illustrates a timing diagram 7600, in accordance with one embodiment of the present invention.

[3363] This effect can also occur in places that are more obscure. For example:
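The one-cycle disagreement between x and y can be reproduced with a small numerical sketch in Python. It is a simplified model under stated assumptions: the pin makes a single 0-to-1 transition, each register sees the pin after a fixed routing delay, and the times (in arbitrary units) and the function name sample are hypothetical.

```python
def sample(pin_rise_time, routing_delay, clock_edges):
    """Value captured by a register at each clock edge.

    The register sees the pin as it was routing_delay time units earlier,
    and the pin is assumed to rise from 0 to 1 once at pin_rise_time.
    """
    return [1 if edge - routing_delay >= pin_rise_time else 0
            for edge in clock_edges]


edges = [10, 20, 30]                 # rising clock edges
x = sample(18.5, 1.0, edges)         # shorter path from pin 1 to x
y = sample(18.5, 2.0, edges)         # slightly longer path from pin 1 to y
print(x, y)
```

At the edge at time 20 the shorter path has already seen the new pin value while the longer path has not, so x and y differ for exactly one clock cycle before agreeing again.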

[3364] interface bus_in(int 1 read) a( ) with

[3365] {data = {“P1”}};

[3366] while (a.read==1)

[3367] {

[3368] x = x + 1;

[3369] }

[3370] In this example, although a.read is only apparently used once, the implementation of a while loop requires the signal to be routed to two different locations giving the same problem as before. The solution to this problem is to use either a bus_latch_in or a bus_clock_in interface sort.

[3371] There is also a timing issue with output buses that needs care when designing interface hardware. In this case, the value output on pins cannot be guaranteed except at rising Handel-C clock edges. In between clock edges, the value may be in the process of changing.

[3372] Since the routing delays through different parts of the logic of the output expression are different, some pins may change before others giving rise to intermediate values appearing on the pins. This is particularly apparent in deep combinatorial logic. For example:

[3373] int 8 x;

[3374] int 8 y;

[3375] interface bus_out( ) output(write=x * y)

[3376] with {data = {“P1”, “P2”, “P3”, “P4”,

[3377] “P5”, “P6”, “P7”, “P8”}};

[3378] Here, a multiplier contains deep logic so some of the 8 pins may change before others leading to intermediate values. It is possible to minimize this effect (although not eliminate it completely) by adding a variable before the output. This effectively adds a flip-flop to the output. The above example then becomes:

[3379] int 8 x;

[3380] int 8 y;

[3381] int 8 z;

[3382] interface bus_out( ) output(write=z)

[3383] with {data = {“P1”, “P2”, “P3”, “P4”,

[3384] “P5”, “P6”, “P7”, “P8”}};

[3385] z = x * y;

[3386] Care should now be taken, because the value of z needs to be updated whenever the value output on the bus is to change. Race conditions within the combinatorial logic can lead to glitches on output pins between clock edges. When this happens, pins may glitch from 0 to 1 and back to 0 or vice versa as signals propagate through the combinatorial logic. Adding a flip-flop at the output in the manner described above removes these effects. These considerations should also be taken into account when using bi-directional tri-state buses since these are effectively a combination of an input bus and an output bus.

[3387] Metastability

[3388] The output of a digital logic gate is a voltage level that normally represents either ‘0’ or ‘1’. If the voltage is below the low threshold, it represents 0 and if it is above the high threshold, it represents 1. However, if the voltage input to a register or latch is between these thresholds on the clock edge, then the output of that register may be indeterminate for a time before reverting to one of the normal states. The state to which it reverts and the time at which it reverts cannot be predicted. This is called metastability, and can occur when data is clocked into a register during the time when the data is changing between the two normal voltage levels representing 0 and 1. It is therefore an important consideration for Handel-C programs that may clock in data when the data is changing state. The metastability characteristics of digital logic devices vary enormously. For a discussion of Xilinx FPGAs see the Xilinx FPGA data sheet (reference 2). This section puts the problem into perspective. For example, a XC4000E device clocking a 1 MHz data signal with a 10 MHz clock is expected only once in a million years to take longer than 3 ns to recover from a metastable state to a stable state. So when designing a system, examine the metastability characteristics of the devices under the conditions in which they may be used to determine whether any precautions need be taken.

[3389] The ideal system is designed such that when data is clocked into a register it is guaranteed to be stable. This can be achieved by using intermediate buffer storage between the two systems that are transferring data between each other. This storage could be a single dual-port register, dual-port memory, FIFO, or shared memory. Handshaking flags are used to indicate that data is ready, and that data has been read. However, even in this situation sampling of the flags could cause metastability. The solution is to clock the flag into the Handel-C program more than once, so it is clocked into one register, and the output of that register is then clocked into another register. On the first clock the flag could be changing state so the output could be metastable for a short time after the clock. However, as long as the clock period is long relative to the possible metastable period, the second clock may clock stable data. Even more clocks further reduce the possibility of metastable states entering the program; however, the move from one clock to two clocks is the most significant and should be adequate for most systems.

[3390] The example below has 4 clocks. The first is in the bus_clock_in procedure, and the next 3 are in the assignments to the variables x, y, and z. int 4 x, y, z; interface bus_clock_in(int 4 read) InBus( ) with {data = {“P1”, “P2”, “P3”, “P4”}}; par { while(1) x = InBus.read; while(1) y = x; { ...... z = y; } }

[3391] Remember to keep the problem in perspective by examining the details of the system to estimate the probability of metastability. Design the system in the first place to minimize the problem by decoupling the FPGA from external synchronous hardware by using external buffer storage.
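The benefit of clocking a flag through more than one register can be sketched with a simple behavioral model in Python. This is an idealized illustration, not a physical model: the marker META stands for a metastable sample, the function name synchronize is hypothetical, and a metastable register is assumed to settle to an unpredictable 0 or 1 within one clock period.

```python
import random

META = "X"  # marker for a sample captured while the input was changing


def synchronize(samples, stages=2, rng=None):
    """Clock samples through a chain of registers and return the last stage's
    output after each clock edge (illustrative model only)."""
    rng = rng or random.Random(0)
    regs = [0] * stages
    outputs = []
    for s in samples:
        # Assumption: a metastable register settles to 0 or 1 before the next edge.
        settled = [rng.choice((0, 1)) if r == META else r for r in regs]
        # All registers update on the same clock edge (shift by one stage).
        regs = [s] + settled[:-1]
        outputs.append(regs[-1])
    return outputs


flag = [0, META, 1, 1]
print(synchronize(flag, stages=1))  # metastable value reaches the program
print(synchronize(flag, stages=2))  # second register sees a settled value
```

With a single stage the metastable marker appears at the output, whereas with two stages it has settled before it is clocked onward, which is the reason the text describes the move from one clock to two as the most significant step.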

[3392] Metastability Across Clock Domains

[3393] There are particular metastability issues when dealing with communications across clock domains. Channels that connect between clock domains are uni-directional point-to-point. The timing between domains is unspecified, but the transmission is guaranteed to occur, and both sides may wait until the transmission has completed. For example:

// File: transmit.c
chan 8 c; // channel may have global scope
void main(void)
{
    int 8 x, y;
    c ! x; // program may wait until data successfully transmitted
    c ! y;
}

// File: receive.c
extern chan c;
void main(void)
{
    int 8 p, q;
    c ? p;
    c ? q;
}

[3394] Ports

[3395] If one is dealing with hardware components in separate clock domains, one may need to insert resynchronising hardware if it is not included in the components. For example, if data is sent from port_out A in domain bbA and received from port_in B in domain bbB, the data may be resynchronized to the clock in domain bbB. This can be done by using the data at least once in the Handel-C wrapper file.

[3396] The example below shows the three files required to connect two EDIF blocks (bbA and bbB) which use different clocks. The small files bbA.c and bbB.c connect to the EDIF code using the port_out from and port_in to interfaces. The metastable.c file generates one flip-flop that resynchronizes the data by reading the value from bbA into a variable.

File: metastable.c
/*
 * Black box code to resynchronize
 * Needs to be clocked from the reading clock
 * (i.e. bbB.c's clock)
 */
int 1 x;
interface bbA (int 1 from) A( );
interface bbB ( ) B(int 1 to=x);
main ( )
{
    while (1)
    {
        /* stabilize the data by adding resynchronization FF */
        x = A.from;
    }
}

File: bbA.c
/*
 * Domain bbA
 * Connects to bbA.edf
 */
void main (void)
{
    int 1 y;
    interface port_out ( ) from (int 1 from = y);
}

File: bbB.c
/*
 * Domain bbB
 * Connects to bbB.edf
 */
void main (void)
{
    int 1 q;
    interface port_in (int 1 to) to ( );
    par
    {
        while (1)
        {
            q = to.to; // Read data
        }
    }
}

[3397] Alternatively, the resynchronising flip-flop can be placed in the file that reads the data from the foreign code block.

File: toplevel.c
/*
 * Code to connect data between two cores
 */
interface bbA (int 1 from) A( );
interface bbB ( ) B(int 1 to=A.from);

File: bbA.c
/*
 * Domain bbA
 * Compiles to bbA.edf
 */
void main (void)
{
    int 1 y;
    interface port_out ( ) from (int 1 from = y);
}

File: bbB.c
/*
 * Domain bbB
 * Compiles to bbB.edf
 */
void main (void)
{
    int 1 q, y;
    interface port_in (int 1 to) to ( );
    while (1)
    {
        par
        {
            q = to.to; // Resynchronize data
            y = q;
        }
    }
}

[3398] Interfacing with External Logic

[3399] Handel-C provides the interface sorts port_in and port_out. These allow one to have a set of wires, unconnected to pins, which one can use to connect to a simulated device or to another function within the FPGA. It is assumed that Handel-C has supplied an interface declaration for these sorts, and one supplies an instance definition.

[3400] port_in

[3401] For a port_in, one defines the port(s) carrying data to the Handel-C code and any associated specifications.

[3402] interface port_in(Type data_TO_hc [with {port_specs}])

[3403] Name( ) [with {Instance_specs}];

[3404] For example:

[3405] interface port_in(int 4 signals_to_HC) read( );

[3406] One can then read the input data from the variable Name.data_TO_hc, in this case read.signals_to_HC.

[3407] port_out

[3408] For a port_out, one defines the port(s) carrying data from the Handel-C code, the expression to be output over those ports, and any associated specifications.

[3409] interface port_out( ) Name(Type data_FROM_hc=output_Expr [with {port_specs}])

[3410] [with {Instance_specs}];

[3411] For example:

[3412] int X_out;

[3413] interface port_out( )

[3414] drive(int 4 signals_from_HC = X_out);

[3415] In this case, the width of X_out would be inferred to be 4, as that is the width of the port that the data is sent to.

[3416] Specifying the Interface

[3417] FIG. 76A is a flowchart 7650 showing a method for providing a versatile interface. First, in operation 7652, computer code is written in a first programming language. Included in the first computer code is a reference to second computer code in a second programming language. See operation 7654. In one aspect of the present invention, the reference to the second computer code may include a predetermined command in the first computer code. In yet a further aspect, the second programming language may be either EDIF or VHDL.

[3418] The second computer code is simulated in the second programming language for use during the execution of the first computer code in the first programming language. Note operation 7656. In an aspect of the present invention, the second computer code may be simulated by a first simulator module. In such an aspect, the first simulator module may interface a second simulator module. As a further option, the first simulator module may interface the second simulator module via a plug-in module.

[3419] FIG. 77 illustrates the manner in which an interface 7700 is specified, in accordance with one embodiment of the present invention. One can specify any particular interface format. This allows one to communicate with code written in another language 7702 such as VHDL or EDIF and allows the Handel-C simulator 7704 to communicate with an external plugin program 7706 (e.g., a connection to a VHDL simulator). The expected use for this is to allow one to incorporate bought-in or handcrafted pieces of low-level code in the high-level Handel-C program. It also allows the Handel-C program code to be incorporated within a large EDIF or VHDL program. One can also use it to communicate with programs running on a PC that simulate external devices.

[3420] To use such a piece of code requires that one include an interface declaration in the Handel-C code to connect it to the external code block. This declaration also tells the simulator to call a plugin (which in turn may invoke a simulator for the foreign code).

[3421] Handel-C Code Required

[3422] The code needed in the Handel-C program is in two parts. First, a person needs an interface declaration. In the simplest form, this is of the format:

[3423] interface sort ( { extern_to_HC_port{, extern_to_HC_port} } ) ({HC_to_extern_port{, HC_to_extern_port} } )

[3424] where:

[3425] sort is the name one gives to this type of interface

[3426] extern_to_HC_port is the prototype (type and name) of an input port used to communicate from the external code to the Handel-C.

[3427] HC_to_extern_port is the prototype (type and name) of an output port used to communicate with the external code from the Handel-C. At least one port (input or output) may be declared. One then needs to define an instance of the interface in the format:

[3428] interface sort( { extern_to_HC_port } )

[3429] Name( { HC_to_extern_port = data_from_HC_to_extern

[3430] [with {portSpec}] {,HC_to_extern_port =

[3431] data_from_HC_to_extern

[3432] [with {portSpec} ] } } )

[3433] [ with {extlib=“simulator_plugin”, specs}]

[3434] where:

[3435] sort is the name one gives to this sort of interface

[3436] extern_to_HC_port is the definition of the previously declared port. This definition may include an optional with specification.

[3437] with {portSpec} is optional. It consists of one or more port

[3438] specifications for a single port in the interface

[3439] Name is the name one gives to this definition of the interface

[3440] HC_to_extern_port is the definition of the previously declared port. This definition may include a with specification.

[3441] data_from_HC_to_extern is an expression which may be sent to the external code from the Handel-C.

[3442] simulator_plugin is the name of a file on the PC which manages the cosimulation. It provides the inputs to and the data from the external code. (This plugin file may in turn invoke another simulator.) Its presence is optional.

[3443] specs are instance specifications required (some of these may depend on the cosimulator file plugin).

[3444] Targeting Specific Tools

[3445] When compiling to EDIF, Handel-C has the capacity to format the names of wires to external logic according to the different syntaxes used by place and route tools. One can do this using the busformat specification to a port. This allows one to specify how the bus name and wire number are formatted.

[3446] To specify a format, one uses the syntax:

[3447] port with {busformat = “formatString”}.

[3448] formatString can be one of the following strings. B represents the bus name, and 1 represents the wire number.

[3449] B1

[3450] B_1

[3451] B[1]

[3452] B(1)

[3453] B<1>

[3454] Example

[3455] interface port_in(int 4 signals_to_HC with

[3456] {busformat=“B[1]”}) read( );

[3457] would produce wires

[3458] signals_to_HC[0]

[3459] signals_to_HC[1]

[3460] signals_to_HC[2]

[3461] signals_to_HC[3]

[3462] ram unsigned 4 rax[4] with {ports = 1, busformat=“B<1>”};

[3463] would produce wires

[3464] rax_SPPort_addr<0> // Address

[3465] rax_SPPort_addr<1>

[3466] rax_SPPort_data_in<0> // Data In

[3467] rax_SPPort_data_in<1>

[3468] rax_SPPort_data_in<2>

[3469] rax_SPPort_data_in<3>

[3470] rax_SPPort_data_out<0> // Data Out

[3471] rax_SPPort_data_out<1>

[3472] rax_SPPort_data_out<2>

[3473] rax_SPPort_data_out<3>

[3474] rax_SPPort_data_en // Data Enable

[3475] rax_SPPort_clk // Clock

[3476] rax_SPPort_cs // Chip Select

[3477] rax_SPPort_oe // Output Enable

[3478] rax_SPPort_we // Write Enable

[3479] Object Specifications

[3480] Handel-C provides the ability to add ‘tags’ to certain objects (variables, channels, ports, buses, RAMs, ROMs, mprams and signals) to control their behavior. These tags or specifications are listed after the declaration of the object using the with keyword. This keyword takes one or more of the following attributes.

[3481] FIGS. 78A through 78C illustrate a table showing the specification of various keywords 7800, in accordance with one embodiment of the present invention. The previous sections have already shown briefly how to use some of these specifications but this section covers these in more detail and also describes the other specifications in the table.

[3482] Specifications can be added to objects as follows:

[3483] unsigned 4 w with {show=0};

[3484] int 5 x with {show=0, base=2};

[3485] chanout char y with {outfile=“output.dat”};

[3486] chanin int 8 z with {infile=“input.dat”};

[3487] interface bus_clock_in(int 4 in) InBus( ) with

[3488] { pull = 1,

[3489] data = {“P1”, “P2”, “P3”, “P4”}

[3490] };

[3491] When declaring multiple objects, the specification may be given at the end of the line and applies to all objects declared on that line. For example:

[3492] unsigned x, y with {show=0};

[3493] This attaches the show specification with a value of 0 to both x and y variables.

[3494] Details of each of the specifications are given below.

[3495] The Show Specification

[3496] The show specification may be given to variable, channel, output bus and tri-state bus declarations. When set to 0, this specification tells the Handel-C simulator not to list this object in its output. This means that it may not appear in the Variables debug window in the GUI.

[3497] The default value of this specification is 1.

[3498] Reducing the number of items displayed in the output list from the simulator produces a noticeable speed up in simulation.

[3499] The Base Specification

[3500] The base specification may be given to variable, output channel, output bus and tri-state bus declarations. The value that this specification is set to tells the Handel-C compiler which base to display the value of the object in. Valid bases are 2, 8, 10 and 16 for binary, octal, decimal and hexadecimal respectively. The default value of this specification is 10.

[3501] The Infile and Outfile Specifications

[3502] The infile specification may be given to chanin, bus_in, bus_latch_in, bus_clock_in, bus_ts, bus_ts_latch_in and bus_ts_clock_in declarations. The outfile specification may be given to chanout, bus_out, bus_ts, bus_ts_latch_in and bus_ts_clock_in declarations. The strings that these specifications are set to may inform the simulator of the file that data should be read from (infile) or the file that data should be written to (outfile). When applied to a variable, the state of that variable at each clock cycle is placed in that file when simulation takes place. Note that when applying the outfile specification, it should not be given to multiple variables or channels. For example, the following declarations are not allowed:

[3503] int x, y with {outfile=“out.dat”};

[3504] chanout a, b with {outfile=“out.dat”};

[3505] For details of connecting channels to files. By default, no input or output files are used.

[3506] The Warn Specification

[3507] The warn specification may be given to a variable, RAM, ROM, channel or bus. When set to zero, certain non-crucial warnings may be disabled for that object. When set to one (the default value), all warnings for that object may be enabled.

[3508] warn=0

The Speed Specification

[3509] The speed specification may be given to an output or tri-state bus. The value of this specification controls the slew rate of the output buffer for the pins on the bus. For Xilinx devices, 0 is slow, 3 is fast, and the default value is 3. For Altera devices, 0 is slow, 1 is fast, and the default value is 1.

[3510] Refer to the Xilinx or Altera FPGA data sheets for details of slew rate control.

[3511] The Intime and Outtime Specifications

[3512] The intime specification may be given to an input port or bus, tri-state bus or off-chip memory. The outtime specification may be given to an output port or bus, tri-state bus or off-chip memory. When applied to Xilinx chips, these specifications cause Handel-C to generate a Netlist Constraints File (NCF) for the design. The place-and-route tools then use this file to constrain the relevant paths.

[3513] intime specifies the maximum delay in ns allowed between an interface or memory and the elements it feeds.

[3514] outtime specifies the maximum delay in ns allowed between an interface or memory and the elements it is fed from. They can be floating point numbers. For example:

[3515] macro expr memoryPins = {“P6”, “P7”, “P8”,

[3516] “P9”, “P10”, “P11”, “P12”, “P13”};

[3517] macro expr dataPins = {“P1”, “P2”, “P3”, “P4”};

[3518] interface bus_in(unsigned 4) hword( ) with {data = dataPins,

[3519] intime = 5};

[3520] interface port_out( )

[3521] (unsigned 4 out = hword.in + 1)

[3522] with {outtime = 5.2};

[3523] ram int 8 a[15][43] with {outtime = 5.2,

[3524] offchip = 1,

[3525] data = memoryPins};

[3526] The Busformat Specification

[3527] The busformat specification may be given to an interface, port or memory that is resident in external logic. When compiled to EDIF, the busformat string defines the format of the wire names. Valid values for the busformat string are:

[3528] B1 B_1 B[1] B(1)

[3529] B represents the bus name and 1 the wire number.

[3530] The default format is B_1
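How a busformat string maps a bus name and wire number onto wire names can be sketched in Python. This is an illustrative model only, not the compiler's implementation: the table FORMATS and the function wire_names are hypothetical, and the B<1> entry is taken from the format list given in the Targeting Specific Tools section.

```python
# Hypothetical lookup table: busformat string -> naming template.
FORMATS = {
    "B1": "{bus}{n}",
    "B_1": "{bus}_{n}",
    "B[1]": "{bus}[{n}]",
    "B(1)": "{bus}({n})",
    "B<1>": "{bus}<{n}>",
}


def wire_names(bus, width, busformat="B_1"):
    """Return the wire names for a bus of the given width.

    B stands for the bus name and 1 for the wire number; the default
    format is assumed to be B_1, as stated in the text.
    """
    template = FORMATS[busformat]
    return [template.format(bus=bus, n=n) for n in range(width)]


print(wire_names("signals_to_HC", 4, "B[1]"))
print(wire_names("signals_to_HC", 4))
```

The first call yields the bracketed names signals_to_HC[0] through signals_to_HC[3] listed in the earlier port_in example; the second shows the default underscore form.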

[3531] The Pull Specification

[3532] The pull specification may be given to an input or tri-state bus. When set to 1, a pull up resistor is added to each of the pins of the bus. When set to 0, a pull down resistor is added to each of the pins of the bus. When this specification is not given for a bus, no pull up or pull down resistor is used. Altera devices do not have pull-up or pull-down resistors. Refer to the Xilinx FPGA data sheet for details of pull up and pull down resistors. By default, no pull up or pull down resistors are attached to the pins.

[3533] The Data Specification

[3534] The data specification may be given to an external interface or memory. It consists of a list of pin numbers separated by commas. If the data specification is omitted, the place and route tools may assign the pins.

[3535] macro expr memoryPins = {“P6”, “P7”, “P8”,

[3536] “P9”, “P10”, “P11”, “P12”, “P13”};

[3537] macro expr dataPins = {“P1”, “P2”, “P3”, “P4”};

[3538] interface bus_in(unsigned 4) hword( ) with {data = dataPins,

[3539] intime = 5};

[3540] ram int 8 a[15][43] with {data = memoryPins};

[3541] The Offchip Specification

[3542] The offchip specification may be given to a RAM or ROM declaration. When set to 1, the Handel-C compiler builds an external memory interface for the RAM or ROM using the pins listed in the addr, data, cs, we and oe specifications (see below). When set to 0, the Handel-C compiler builds the RAM or ROM on the FPGA and ignores any pins given with other specifications.

[3543] intime and outtime specifications can also be applied to off-chip RAMs. If they have not been given, the compiler attempts to build them from the rate, westart, and welength specifications. If any of these are missing, the compiler does not calculate time specs for the memory.

[3544] ram int 8 a[5][43] with {offchip = 1};

[3545] The Ports Specification

[3546] The ports specification may be given to a RAM or ROM declaration. When set to 1, the Handel-C compiler builds an external memory interface for the RAM or ROM using the ports defined in the addr, data, cs, we and oe specifications (see below). This allows one to connect to RAMs in external code. The compiler generates an error if the ports and offchip specifications are both set to 1 for the same memory. All other specifications can be applied. If the ports specification is applied to an MPRAM, a separate interface may be generated for each port.

[3547] The Xilinx Block Specification

[3548] The block specification may be given to a RAM or ROM declaration. One can specify that a block RAM is created in a Xilinx Virtex chip by using the specification block = 1. E.g.

[3549] ram int 8 a[15][43] with {block = 1};

[3550] The default block specification is 0 (not in block memory).

[3551] The Wegate Specification

[3552] The wegate specification may be given to external or internal RAMdeclarations to force the generation of an asynchronous RAM. When set to0, the write strobe may appear throughout the Handel-C clock cycle. Whenset to -1, the write strobe may appear only in the first half of theHandel-C clock cycle. When set to 1, the write strobe may appear only inthe second half of the Handel-C clock cycle.

[3553] The Westart and Welength Specifications

[3554] The westart and welength specifications may be given to internalor external RAM declarations. One can only use these specificationstogether with external_divide or internal_divide clock types with adivision factor greater than 1.

[3555] The westart and welength specifications position the write enable strobe within the Handel-C clock cycle.

[3556] The rclkpos, wclkpos, clkpulselen and clk Specifications

[3557] The rclkpos, wclkpos and clkpulselen specifications may be given to internal or external SSRAM declarations. The clk specification is used for external SSRAM declarations. These specifications can only be used with the external_divide or internal_divide clock types with a division factor of 2 or more.

[3558] rclkpos specifies the positions of the clock cycles of the new RAM clock RAMCLK for a read cycle. These positions are specified in terms of cycles of a fast external clock CLK, counting forwards from the rising edge of the Handel-C clock HCLK.

[3559] wclkpos specifies the positions of the clock cycles of the new RAM clock RAMCLK for a write cycle.

[3560] clkpulselen specifies the length of the pulses of the new RAM clock RAMCLK, in terms of cycles of CLK. This is specified only once for a RAM. It thus applies to both the read and write clocks.

[3561] clk specifies the pin(s) that carry the new RAM clock RAMCLK to the external SSRAM.

[3562] Specifying Pin Outs

[3563] The addr, data, we, cs and oe specifications each take a list of device pins and are used to define the connections between the FPGA and external devices. FIG. 78D illustrates the manner in which pin outs are specified 7850, in accordance with one embodiment of the present invention.

[3564] Pin lists are always given in the order most significant to least significant. Multiple write enable, chip select and output enable pins can be given to allow external RAMs and ROMs to be constructed from multiple devices. For example, when using two 4-bit wide chips to make an 8-bit wide RAM, the following declaration could be used:

[3565] ram unsigned 8 ExtRAM[256] with {offchip = 1,

[3566] addr={“P1”, “P2”, “P3”, “P4”,

[3567] “P5”, “P6”, “P7”, “P8”},

[3568] data={“P9”, “P10”, “P11”, “P12”,

[3569] “P13”, “P14”, “P15”, “P16”},

[3570] we={“P17”, “P18”},

[3571] cs={“P19”, “P20”},

[3572] oe={“P21”, “P22”}};

[3573] The Rate Specifications

[3574] The rate specification may be given to a clock to specify the maximum delay in ns allowed between components fed from that clock. This specification causes Handel-C to generate a Netlist Constraints File (NCF) for the design. The place-and-route tools then use this file to constrain the relevant paths so that the part of the design connected to the clock in question can be clocked at the specified rate. rate may be a floating-point number. For example:

[3575] set clock = external_divide “D17” 4 with

[3576] {rate = 1.4};

[3577] Example Hardware Interface

[3578] An example, theoretical interface is now described to illustrate the use of buses. The scenario is of an external device connected to the FPGA which may be read from or written to. The device has a number of signals connected to the FPGA. FIG. 79 illustrates the various signals 7900 employed by the present invention.

[3579] A read from the device is performed by waiting for ReadRdy to become active (high). The Read signal is then taken high for one clock cycle and the data sampled on the falling edge of the strobe. FIG. 80 illustrates a waveform representative of a read cycle 8000, in accordance with one embodiment of the present invention.

[3580] A write to the device is performed by waiting for WriteRdy to become active (high). The Write signal is then taken high for one clock cycle while the data is driven to the device by the FPGA. The device samples the data on the falling edge of the Write signal. FIG. 81 illustrates a waveform representative of a write cycle 8100, in accordance with one embodiment of the present invention.

[3581] The first stage of the code may declare the buses associated with each of the external signals. The following code does this:

[3582] int 4 Data;

[3583] int 1 En = 0;

[3584] interface bus_ts_clock_in(int 4)

[3585] dataB(Data, En==1) with

[3586] {data = {“P1”, “P2”, “P3”, “P4”}};

[3587] int 1 Write = 0;

[3588] interface bus_out( ) writeB(Write) with

[3589] {data = {“P5”}};

[3590] int 1 Read = 0;

[3591] interface bus_out( ) readB(Read) with

[3592] {data = {“P6”}};

[3593] interface bus_clock_in(int 1)

[3594] WriteReady( ) with {data = {“P7”}};

[3595] interface bus_clock_in(int 1) ReadReady( ) with

[3596] {data = {“P8”}};

[3597] The values on the output buses can now be changed by setting the values of the Data, Write and Read variables. In addition, one can drive the data bus with the contents of Data by setting En to 1. Note that the variables that drive buses have been initialized to 0, so these variables may be static or global. This may be important when driving write strobes as in the present case. Care should be taken during configuration that the FPGA pins are disconnected in some way from the external devices because the FPGA pins become tri-state during this time.

[3598] The main program reads a word from the external device before writing one word back.

void main(void)
{
    int 4 Reg;
    // Read word from external device
    while (ReadReady.in == 0)
        delay;
    Read = 1; // Set the read strobe
    par
    {
        Reg = dataB.in; // Read the bus
        Read = 0; // Clear the read strobe
    }
    // Write one word back to external device
    Data = Reg + 1;
    while (WriteReady.in == 0)
        delay;
    par
    {
        En = 1; // Drive the bus
        Write = 1; // Set the write strobe
    }
    Write = 0; // Clear the write strobe
    En = 0; // Stop driving the bus
}

[3599] Note that during the write phase, the data bus is driven for one clock cycle after the write strobe goes low to ensure that the data is stable across the falling edge of the strobe.

[3600] Standard Macro Expressions

[3601] Introduction

[3602] The Handel-C compiler is provided with a standard header file containing a collection of useful macro expressions. This header file may be used by simply including it in the Handel-C program with the following line:

[3603] #include <stdlib.h>

[3604] Note that this header file is not the same as the conventional C stdlib.h header file but contains a standard collection of definitions useful to the Handel-C programmer.

[3605] The definitions themselves are included in the stdlib.lib library, which is supplied in the Handel-C\Lib directory. When using the macro definitions, one should ensure that this directory is included in the library include path. The following sections describe each macro in detail.

[3606] Constant Definitions

[3607] The stdlib.h header file contains the following constant definitions:

[3608] Constant Name Definition

[3609] TRUE 1

[3610] FALSE 0

[3611] These definitions often lead to cleaner and more readable code. For example:

[3612] int 8 x with { show = FALSE };

[3613] while (TRUE)

[3614] {

[3615] . . .

[3616] }

[3617] if (a==TRUE)

[3618] {

[3619] . . .

}

[3620] Bit Manipulation Macros

[3621] The stdlib.h header file contains a number of macro expressions used to manipulate bits and bit fields, listed below.

[3622] Adjs

[3623] Usage: adjs( Expression, Width )

[3624] Parameters:

[3625] Expression Expression to adjust (may be signed integer)

[3626] Width Width to adjust to

[3627] Returns:

[3628] Signed integer of width Width.

[3629] Description:

[3630] Adjusts width of signed expression up or down.

[3631] Sign extends MSBs of expression when expanding width. Drops MSBs of expression when reducing width.

[3632] Example:

[3633] int 4 x;

[3634] int 5 y;

[3635] int 6 z;

[3636] y = 15;

[3637] x = adjs(y, width(x)); // x = −1

[3638] y = −4;

[3639] z = adjs(y, width(z)); // z = −4
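The width adjustment described for adjs can be illustrated with a small software model in plain C. This is only a sketch under stated assumptions: the function name adjs_model and its explicit to_width parameter are hypothetical and are not part of Handel-C, and the model relies on C already holding the input as a sign-extended 32-bit value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C model of adjs: keep the low `to_width` bits of `value`
 * and reinterpret them as a two's-complement signed number. */
int32_t adjs_model(int32_t value, unsigned to_width)
{
    assert(to_width >= 1 && to_width <= 32);
    uint32_t mask = (to_width == 32) ? 0xFFFFFFFFu : ((1u << to_width) - 1u);
    uint32_t sign = 1u << (to_width - 1);
    /* Dropping the MSBs is a mask; when widening, the int32_t input is
     * already sign-extended, so the mask keeps the value unchanged. */
    uint32_t bits = (uint32_t)value & mask;
    /* XOR-then-subtract gives the top retained bit a negative weight. */
    return (int32_t)((int64_t)(bits ^ sign) - (int64_t)sign);
}
```

With this model, reducing the 5-bit value 15 (binary 01111) to 4 bits keeps the pattern 1111, which reads as a negative number, while widening the value −4 to 6 bits leaves it unchanged.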

[3640] Adju

[3641] Usage: adju( Expression, Width )

[3642] Parameters:

[3643] Expression Expression to adjust (may be unsigned integer)

[3644] Width Width to adjust to

[3645] Returns:

[3646] Unsigned integer of width Width.

[3647] Description:

[3648] Adjusts width of unsigned expression up or down.

[3649] Zero pads MSBs of expression when expanding width. Drops MSBs of expression when reducing width.

[3650] Example:

[3651] unsigned 4 x;

[3652] unsigned 5 y;

[3653] unsigned 6 z;

[3654] y = 14;

[3655] x = adju(y, width(x)); // x = 14

[3656] z = adju(y, width(z)); // z = 14
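For comparison with the signed case, the unsigned adjustment reduces to a simple mask in software. The following C sketch is illustrative only; adju_model and its to_width parameter are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C model of adju: keep the low `to_width` bits.
 * Widening needs no work because the upper bits are already zero. */
uint32_t adju_model(uint32_t value, unsigned to_width)
{
    assert(to_width >= 1 && to_width <= 32);
    uint32_t mask = (to_width == 32) ? 0xFFFFFFFFu : ((1u << to_width) - 1u);
    return value & mask;
}
```

A value that fits in the target width, such as 14 in 4 bits, passes through unchanged; a value that does not fit loses its most significant bits.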

[3657] Copy

[3658] Usage: copy( Expression, Count )

[3659] Parameters:

[3660] Expression Expression to copy

[3661] Count Number of times to copy

[3662] Returns:

[3663] Expression duplicated Count times.

[3664] Returned expression is of same type as Expression.

[3665] Returned width is Count * width(Expression).

[3666] Description:

[3667] Duplicates a bit field multiple times.

[3668] Example:

[3669] unsigned 32 x;

[3670] unsigned 4 y;

[3671] y = 0xA;

[3672] x = copy(y, 8); // x = 0xAAAAAAAA
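The duplication performed by copy can be modelled in C as a shift-and-OR loop. This is a sketch under assumptions: copy_model is a hypothetical name, the field width is passed explicitly because C integers carry no bit width, and the result is limited to 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C model of copy: concatenate `count` copies of the low
 * `width` bits of `field`. The result is width * count bits wide. */
uint64_t copy_model(uint64_t field, unsigned width, unsigned count)
{
    assert(width >= 1 && width <= 32);
    assert(count >= 1 && width * count <= 64);
    uint64_t mask = (1ULL << width) - 1ULL;
    uint64_t out = 0;
    for (unsigned i = 0; i < count; i++) {
        out = (out << width) | (field & mask); /* append one more copy */
    }
    return out;
}
```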

[3673] Lmo

[3674] Usage: lmo(Expression)

[3675] Parameters:

[3676] Expression Expression to calculate left most one of.

[3677] Returns:

[3678] Bit position of left most one in Expression or width(Expression) if Expression is zero. Return value is log2(width(Expression)) + 1 bits wide.

[3679] Description:

[3680] Finds the position of the most significant 1 bit in an expression.

[3681] Example:

[3682] int 4 x;

[3683] int 3 y;

[3684] x = 3;

[3685] y = lmo(x); // y = 1

[3686] x = 0;

[3687] y = lmo(x); // y = 4

[3688] lmz

[3689] Usage: lmz(Expression)

[3690] Parameters:

[3691] Expression Expression to calculate left most zero of.

[3692] Returns:

[3693] Bit position of left most zero in Expression or width(Expression) if Expression is all ones.

[3694] Return value is log2(width(Expression))+1 bits wide.

[3695] Description:

[3696] Finds the position of the most significant 0 bit in an expression.

[3697] Example:

[3698] int 4 x;

[3699] int 3 y;

[3700] x = 3;

[3701] y = lmz(x); // y = 3

[3702] x = 15;

[3703] y = lmz(x); // y = 4
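The left-most searches can be sketched in C as a scan from the most significant bit downwards. The names lmo_model and lmz_model and the explicit width parameter are assumptions made for this illustration; bit positions count from 0 at the least significant bit, and the width is returned when no matching bit exists, as the descriptions above state.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C model of lmo: position of the most significant 1 bit
 * within the low `width` bits, or `width` if there is none. */
unsigned lmo_model(uint32_t value, unsigned width)
{
    assert(width >= 1 && width <= 32);
    for (unsigned pos = width; pos-- > 0; ) {
        if ((value >> pos) & 1u) {
            return pos;
        }
    }
    return width;
}

/* Hypothetical C model of lmz: position of the most significant 0 bit
 * within the low `width` bits, or `width` if all bits are 1. */
unsigned lmz_model(uint32_t value, unsigned width)
{
    assert(width >= 1 && width <= 32);
    for (unsigned pos = width; pos-- > 0; ) {
        if (((value >> pos) & 1u) == 0u) {
            return pos;
        }
    }
    return width;
}
```

For a 4-bit value of 3 (binary 0011), the most significant 1 is at position 1 and the most significant 0 is at position 3.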

[3704] Population

[3705] Usage: population(Expression)

[3706] Parameters:

[3707] Expression Expression to calculate population of.

[3708] Returns:

[3709] Value of same type as Expression.

[3710] Description:

[3711] Counts the number of 1 bits (population) in Expression.

[3712] Example:

[3713] int 4 x;

[3714] int 3 y;

[3715] x = 0b1011;

[3716] y = population(x); // y = 3
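A population count can be modelled in C by repeatedly clearing the lowest set bit. This is an illustrative sketch only; population_model and its width parameter are hypothetical names, and the hardware generated by Handel-C is not claimed to work this way.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C model of population: count the 1 bits in the low
 * `width` bits of `value`. */
unsigned population_model(uint32_t value, unsigned width)
{
    assert(width >= 1 && width <= 32);
    uint32_t mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    uint32_t bits = value & mask;
    unsigned count = 0;
    while (bits != 0u) {
        bits &= bits - 1u; /* clears the lowest set bit */
        count++;
    }
    return count;
}
```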

[3717] Rmo

[3718] Usage: rmo(Expression)

[3719] Parameters:

[3720] Expression Expression to calculate right most one of.

[3721] Returns:

[3722] Bit position of right most one in Expression or width(Expression) if Expression is zero. Return value is log2(width(Expression))+1 bits wide.

[3723] Description:

[3724] Finds the position of the least significant 1 bit in an expression.

[3725] Example:

[3726] int 4 x;

[3727] int 3 y;

[3728] x = 3;

[3729] y = rmo(x); // y = 0

[3730] x = 0;

[3731] y = rmo(x); // y = 4

[3732] Rmz

[3733] Usage: rmz(Expression)

[3734] Parameters:

[3735] Expression Expression to calculate right-most zero of.

[3736] Returns:

[3737] Bit position of right most zero in Expression or width(Expression) if Expression is all ones.

[3738] Return value is log2(width(Expression))+1 bits wide.

[3739] Description:

[3740] Finds the position of the least significant 0 bit in an expression.

[3741] Example:

[3742] int 4 x;

[3743] int 3 y;

[3744] x = 3;

[3745] y = rmz(x); // y = 2

[3746] x = 15;

[3747] y = rmz(x); // y = 4
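The right-most searches mirror the left-most ones, scanning upwards from bit 0. The C sketch below is a software model only; rmo_model, rmz_model and the explicit width parameter are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C model of rmo: position of the least significant 1 bit
 * within the low `width` bits, or `width` if there is none. */
unsigned rmo_model(uint32_t value, unsigned width)
{
    assert(width >= 1 && width <= 32);
    for (unsigned pos = 0; pos < width; pos++) {
        if ((value >> pos) & 1u) {
            return pos;
        }
    }
    return width;
}

/* Hypothetical C model of rmz: position of the least significant 0 bit
 * within the low `width` bits, or `width` if all bits are 1. */
unsigned rmz_model(uint32_t value, unsigned width)
{
    assert(width >= 1 && width <= 32);
    for (unsigned pos = 0; pos < width; pos++) {
        if (((value >> pos) & 1u) == 0u) {
            return pos;
        }
    }
    return width;
}
```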

[3748] Top

[3749] Usage: top(Expression, Width)

[3750] Parameters:

[3751] Expression Expression to extract bits from.

[3752] Width Number of bits to extract.

[3753] Returns:

[3754] Value of same type as Expression.

[3755] Description:

[3756] Extracts the most significant Width bits from an expression.

[3757] Example:

[3758] int 32 x;

[3759] int 8 y;

[3760] x = 0x12345678;

[3761] y = top(x, width(y)); // y = 0x12
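Extracting the most significant bits amounts to a right shift by the number of bits discarded. The following C sketch illustrates this; top_model, width and count are hypothetical names, with width standing for the width of the source expression and count for the number of bits extracted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C model of top: return the `count` most significant bits
 * of a `width`-bit value. */
uint32_t top_model(uint32_t value, unsigned width, unsigned count)
{
    assert(width >= 1 && width <= 32);
    assert(count >= 1 && count <= width);
    uint32_t mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    /* Shift out the (width - count) low bits that are not wanted. */
    return (value & mask) >> (width - count);
}
```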

[3762] Arithmetic Macros

[3763] The stdlib.h header file contains a number of macro expressions for mathematical calculations, listed below.

[3764] Abs

[3765] Usage: abs(Expression)

[3766] Parameters:

[3767] Expression Signed expression to get absolute value of.

[3768] Returns:

[3769] Signed value of same width as Expression.

[3770] Description:

[3771] Obtains the absolute value of an expression.

[3772] Example:

[3773] int 8 x;

[3774] int 8 y;

[3775] x = 34;

[3776] y = −18;

[3777] x = abs(x); // x = 34

[3778] y = abs(y); // y = 18.

[3779] Addsat

[3780] Usage: addsat(Expression1, Expression2)

[3781] Parameters:

[3782] Expression1 Unsigned operand 1.

[3783] Expression2 Unsigned operand 2. May be of same width as Expression1.

[3784] Returns:

[3785] Unsigned value of same width as Expression1 and Expression2.

[3786] Description:

[3787] Returns sum of Expression1 and Expression2. Addition is saturated and result may not be greater than maximum value representable in the width of the result.

[3788] Example:

[3789] unsigned 8 x;

[3790] unsigned 8 y;

[3791] unsigned 8 z;

[3792] x = 34;

[3793] y = 18;

[3794] z = addsat(x, y); // z = 52

[3795] x = 34;

[3796] y = 240;

[3797] z = addsat(x, y); // z = 255
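Saturating addition can be modelled in C by computing the sum in a wider type and clamping it to the largest value that fits in the operand width. This is a sketch under assumptions: addsat_model and its width parameter are hypothetical names, and both operands are assumed to fit in width bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C model of addsat for `width`-bit unsigned operands:
 * the sum is clamped to the maximum value representable in `width` bits. */
uint32_t addsat_model(uint32_t a, uint32_t b, unsigned width)
{
    assert(width >= 1 && width <= 32);
    uint64_t max = (1ULL << width) - 1ULL;
    assert(a <= max && b <= max);
    uint64_t sum = (uint64_t)a + (uint64_t)b; /* cannot overflow 64 bits */
    return (uint32_t)(sum > max ? max : sum);
}
```

For 8-bit operands, 34 + 18 stays at 52, whereas 34 + 240 would be 274 and is clamped to 255.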

[3798] Decode

[3799] Usage: decode(Expression)

[3800] Parameters:

[3801] Expression Unsigned operand.

[3802] Returns:

[3803] Unsigned value of width 2^width(Expression).

[3804] Description:

[3805] Returns 2^Expression.

[3806] Example:

[3807] unsigned 4 x;

[3808] unsigned 16 y;

[3809] x = 8;

[3810] y = decode(x); // y = 0b100000000

[3811] div

[3812] Usage: div(Expression1, Expression2)

[3813] Parameters:

[3814] Expression1 Unsigned operand 1.

[3815] Expression2 Unsigned operand 2. May be of the same width as Expression1.

[3816] Returns:

[3817] Unsigned value of same width as Expression1 and Expression2.

[3818] Description:

[3819] Returns integer value of Expression1/Expression2.

[3820] Example:

[3821] unsigned 8 x;

[3822] unsigned 8 y;

[3823] unsigned 8 z;

[3824] x = 56;

[3825] y = 6;

[3826] z = div(x,y); //z = 9

[3827] Warning! Division requires a large amount of hardware and should be avoided unless absolutely necessary.

[3828] exp2

[3829] Usage: exp2(Constant)

[3830] Parameters:

[3831] Constant Operand.

[3832] Returns:

[3833] Constant of width width(Constant)+1.

[3834] Description:

[3835] Used to calculate 2^Constant. Similar to decode but may be used with constants of undefined width.

[3836] Example:

[3837] unsigned 4 x;

[3838] unsigned (exp2(width(x))) y; // y of width 16

[3839] incwrap

[3840] Usage: incwrap(Expression1, Expression2)

[3841] Parameters:

[3842] Expression1 Operand 1.

[3843] Expression2 Operand 2. May be of same width as Expression1.

[3844] Returns:

[3845] Value of same type and width as Expression1 and Expression2.

[3846] Description:

[3847] Used to increment a value with wrap around at a second value. Returns Expression1+1 or 0 if Expression1+1 is equal to Expression2.

[3848] Example:

[3849] unsigned 8 x;

[3850] x = 74;

[3851] x = incwrap(x, 76); // x = 75

[3852] x = incwrap(x, 76); // x = 0

[3853] x = incwrap(x, 76); // x = 1
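The wrap-around rule can be written out in C as a single comparison. This is a minimal sketch; incwrap_model is a hypothetical name, and the precondition that the value starts below the limit is an assumption of the model rather than something stated in the description.

```c
#include <assert.h>

/* Hypothetical C model of incwrap: increment `value`, returning 0 when
 * the incremented value reaches `limit`. */
unsigned incwrap_model(unsigned value, unsigned limit)
{
    assert(value < limit); /* assumed precondition of this model */
    unsigned next = value + 1u;
    return (next == limit) ? 0u : next;
}
```

Called repeatedly with a limit of 76, the model counts 74, 75, 0, 1 and so on, so the limit itself is never produced.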

[3854] log2ceil

[3855] Usage: log2ceil(Constant)

[3856] Parameters:

[3857] Constant Operand.

[3858] Returns:

[3859] Constant value of ceiling(log2(Constant)).

[3860] Description:

[3861] Used to calculate log2 of a number and rounds the result up. Useful to determine the width of a variable needed to contain a particular value.

[3862] Example:

[3863] unsigned (log2ceil(5768)) x; // x 13 bits wide

[3864] unsigned 8 y;

[3865] y = log2ceil(8); // y = 3

[3866] y = log2ceil(7); // y = 3

[3867] log2floor

[3868] Usage: log2floor(Constant)

[3869] Parameters:

[3870] Constant Operand.

[3871] Returns:

[3872] Constant value of floor(log2(Constant)).

[3873] Description:

[3874] Used to calculate log2 of a number and rounds the result down.

[3875] Example:

[3876] unsigned 8 y;

[3877] y = log2floor(8); // y = 3

[3878] y = log2floor(7); // y = 2
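Both logarithm macros can be modelled in C with integer arithmetic only. The sketch below is illustrative; log2floor_model and log2ceil_model are hypothetical names, and the operand is assumed to be at least 1 because the logarithm of zero is undefined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C model of log2floor: count how many times the operand
 * can be halved before it reaches 1. */
unsigned log2floor_model(uint32_t n)
{
    assert(n >= 1);
    unsigned result = 0;
    while (n >>= 1) {
        result++;
    }
    return result;
}

/* Hypothetical C model of log2ceil: equal to the floor for exact powers
 * of two, otherwise one more. */
unsigned log2ceil_model(uint32_t n)
{
    assert(n >= 1);
    unsigned floor_value = log2floor_model(n);
    return ((n & (n - 1u)) == 0u) ? floor_value : floor_value + 1u;
}
```

The two functions agree for exact powers of two such as 8 and differ by one otherwise, which matches the behaviour shown for 7 and 8 in the examples above.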

[3879] Mod

[3880] Usage: mod(Expression1, Expression2)

[3881] Parameters:

[3882] Expression1 Unsigned operand 1.

[3883] Expression2 Unsigned operand 2. May be of the same width as Expression1.

[3884] Returns:

[3885] Unsigned value of same width as Expression1 and Expression2.

[3886] Description:

[3887] Returns remainder of Expression1 divided by Expression2.

[3888] Example:

[3889] unsigned 8 x;

[3890] unsigned 8 y;

[3891] unsigned 8 z;

[3892] x = 56;

[3893] y = 6;

[3894] z = mod(x, y); // z = 2

[3895] Warning! Modulus arithmetic requires a large amount of hardware and should be avoided unless absolutely necessary.

[3896] Sign

[3897] Usage: sign(Expression)

[3898] Parameters:

[3899] Expression Signed operand.

[3900] Returns:

[3901] Unsigned integer 1-bit wide.

[3902] Description:

[3903] Used to obtain the sign of an expression. Returns zero if Expression is positive or one if Expression is negative.

[3904] Example:

[3905] int 8 y;

[3906] unsigned 1 z;

[3907] y = 53;

[3908] z = sign(y); // z = 0

[3909] y = −53;

[3910] z = sign(y); // z = 1

[3911] subsat

[3912] Usage: subsat(Expression1, Expression2)

[3913] Parameters:

[3914] Expression1 Unsigned operand 1.

[3915] Expression2 Unsigned operand 2. May be of same width asExpression1.

[3916] Returns:

[3917] Unsigned value of same width as Expression1 and Expression2.

[3918] Description:

[3919] Returns difference between Expression1 and Expression2. Subtraction is saturated and result may not be less than 0.

[3920] Example:

[3921] unsigned 8 x;

[3922] unsigned 8 y;

[3923] unsigned 8 z;

[3924] x = 34;

[3925] y = 18;

[3926] z = subsat(x, y); // z = 16

[3927] x = 34;

[3928] y = 240;

[3929] z = subsat(x, y); // z = 0

Clocks

[3930] Multiple Clocks

[3931] One can have multiple clocks interfacing with the design. Each main( ) function may be associated with a clock.

[3932] Internal Clocks

[3933] One can set the clock to be any expression or any expression divided by a given factor. For Xilinx 4000 series chips, one can set the clock to be a value read from the on-chip clock generator.

[3934] set clock = internal <Expression>;

[3935] set clock = internal_divide <Expression> factor;

[3936] This allows one to set the clock to a value read from an interface.

[3937] Example

[3938] interface port_in(unsigned 1 clk) ClockPort( );

[3939] set clock = internal ClockPort.clk;

[3940] External Clocks

[3941] External clocks may be accessed by associating the clock with a specific pin using set clock = external “pin_Name” or set clock = external_divide “pin_Name” factor.

[3942] The pin_Name string is optional. If it is omitted, the pins are unconstrained and the place and route tools can assign the pin. One can also define an interface that reads an external clock. If the clock is associated with a specific pin, one can use the interface sort bus_in. One would only need to do this if the external clock has been divided; otherwise one can use the intrinsic __clock (see below).

[3943] Example

[3944] interface bus_in(unsigned 1 in) InputBus( )

[3945] with {data={“Pin1”}};

[3946] set clock = internal_divide InputBus.in 3;

[3947] One can now use InputBus.in to get an undivided external clock. It may be more efficient to omit the pin specification and allow the place and route tools to assign the pin.

[3948] interface bus_in(unsigned 1 in) InputBus( );

[3949] set clock = internal_divide InputBus.in 3;

[3950] Current clock

[3951] The current clock used by a function can be referenced using the keyword __clock. This allows the function to pass the current clock to an external interface. The value of the system variable __clock may be the value after any divide. The clock may be an internal or an external clock.

[3952] Example

[3953] The code below shows the current clock in an interface.

[3954] interface reg32x1k( ) registers(address, data_in, __clock, write)

[3955] with {extlib=“PluginModelSim.dll”,

[3956] extinst=“1; model = reg32x1k_wrapper; clock=ck:25”};

[3957] Communicating Between Clock Domains

[3958] It is not legal to access the same variable from different clock domains. Instead, one may transmit data between clock domains using a channel or a port.

[3959] Channels

[3960] Channels that connect between clock domains may be uni-directional point-to-point. This means that their first use defines their direction and the domains in which they transmit and receive. If one attempts to re-use the channel in a different direction or to or from a different clock domain, the compiler generates an error. Channels used between clock domains may be declared in one file and then referenced as extern in another. The timing between domains is unspecified, but the transmission is guaranteed to occur, and both sides may wait until the transmission has completed. For example:

// File: transmit.c
chan 8 c; // channel may have global scope
main( )
{
    int 8 x, y;
    c ! x; // program may wait until data successfully transmitted
    c ! y;
}

// File: receive.c
extern chan c;
main( )
{
    int 8 p, q;
    c ? p;
    c ? q;
}

[3961] Multi-File Projects

[3962] Introduction

[3963] One can combine multiple files in a single project. The project can have a single main function or several. If there are multiple main functions within a single project, they can be loaded onto the same chip. Each main function can be associated with a different clock. The project can include libraries (pre-compiled Handel-C code) and blocks of foreign code (e.g. VHDL). EDIF and VHDL linking is done by synthesis tools. One can refer to functions, macros and shared expressions that have been defined in another file by prototyping them. One prototypes an object by declaring it at the top of the file in which it is used.

[3964] Function prototypes are in the following format:

[3965] returnType functionName(parameterTypeList);

[3966] Macro prototypes are like this:

[3967] macro expr Name(parameterList);

[3968] macro proc Name(parameterList);

[3969] Functions and macros may be static or extern. Static functions and macros may only be used in the file where they are defined.

[3970] One can collect all the prototypes into a single header file and then #include it within the code files.

[3971] One can access variables declared in other files by using the extern keyword.

[3972] One cannot use variables to communicate between clock domains. Variables are restricted to a single clock domain. The only items that can connect across separate clock domains are channels and MPRAMs.

[3973] Language Summary

[3974] Introduction

[3975] This section summarizes the previous sections by listing all the Handel-C types, statements and operators.

[3976] Type Summary

[3977] FIG. 82 illustrates a table that lists the most common types that may be associated with a variable 8200, in accordance with one embodiment of the present invention. FIG. 83 illustrates a table that lists all prefixes to the above types for different architectural object types 8300, in accordance with one embodiment of the present invention.

[3978] Statement Summary

[3979] FIG. 84 illustrates a table that lists all statements in the Handel-C language 8400, in accordance with one embodiment of the present invention. Note that the assignment group of operations and the increment and decrement operations are included as statements to reflect the fact that Handel-C expressions cannot contain side effects.

[3980] Operator Summary

[3981] FIGS. 85A and 85B illustrate a table that lists all operators in the Handel-C language 8500, in accordance with one embodiment of the present invention. In this table, entries at the top have the highest precedence and entries at the bottom have the lowest precedence. Entries within the same group have the same precedence. Note that function calls and assignments are not true operators in Handel-C.

[3982] Complete Language Syntax

[3983] Introduction

[3984] In this section of the present description, the complete Handel-C language syntax may be given in BNF-like notation.

[3985] Keyword Summary

[3986] FIGS. 86A through 86E illustrate a table that lists keywords 8600, in accordance with one embodiment of the present invention. The keywords listed are reserved and cannot be used for any other purpose. Keywords not in ISO-C are in bold. The following character sequences are also reserved: /* */ // # " '

[3987] Complete Language Syntax

[3988] The conventions used in this language reference are:

[3989] Terminal symbols are set in typewriter font like this.

[3990] Non-terminal symbols are set in italic font like this.

[3991] Square brackets [ . . . ] denote optional components.

[3992] Braces { . . . } denote zero, one or more repetitions of the enclosed components.

[3993] Braces with a trailing plus sign { . . . }+ denote one or several repetitions of the enclosed components.

[3994] Parentheses ( . . . ) denote grouping.

[3995] Identifiers

[3996] Identifiers are sequences of letters, digits and _, starting with a letter. All characters in an identifier are meaningful and all identifiers are case sensitive.

[3997] identifier ::= letter {letter | 0 . . . 9}

[3998] letter ::= A . . . Z | a . . . z | _
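The identifier rule can be checked mechanically. The C sketch below is a hypothetical recognizer written for illustration, not part of any Handel-C tool: it accepts a string that starts with a letter and continues with letters or digits, treating the underscore as a letter as the letter rule states. ASCII character ordering is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* letter ::= A..Z | a..z | _  (ASCII ordering assumed) */
static int is_letter(char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_';
}

static int is_digit(char c)
{
    return c >= '0' && c <= '9';
}

/* Hypothetical recognizer for: identifier ::= letter {letter | digit}.
 * Returns 1 if `s` is a well-formed identifier, otherwise 0. */
int is_identifier(const char *s)
{
    assert(s != NULL);
    if (!is_letter(s[0])) {
        return 0; /* also rejects the empty string */
    }
    for (size_t i = 1; s[i] != '\0'; i++) {
        if (!is_letter(s[i]) && !is_digit(s[i])) {
            return 0;
        }
    }
    return 1;
}
```

The recognizer does not check for reserved keywords, which would need a separate lookup.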

[3999] Integer Constant

[4000] integer-constant ::= [−]{1 . . . 9} + {0 . . . 9}

[4001] | [−](0x | 0X){0 . . . 9 | A . . . F | a . . . f} +

[4002] | [−](0){0 . . . 7}

[4003] | [−](0b | 0B){0 . . . 1} +
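The four alternatives of the integer-constant rule (decimal, hexadecimal, octal and binary, each with an optional leading minus sign) can be sketched as a small C recognizer. The function is_integer_constant is a hypothetical helper for illustration only, and it assumes the minus sign is written as the ASCII hyphen character.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if every character of `s` is in `set`. An empty `s` is
 * accepted only when `allow_empty` is non-zero. */
static int all_in(const char *s, const char *set, int allow_empty)
{
    if (*s == '\0') {
        return allow_empty;
    }
    return s[strspn(s, set)] == '\0';
}

/* Hypothetical recognizer for the integer-constant rule. */
int is_integer_constant(const char *s)
{
    assert(s != NULL);
    if (*s == '-') {
        s++; /* optional sign */
    }
    if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X')) {
        return all_in(s + 2, "0123456789ABCDEFabcdef", 0); /* hex: 1+ digits */
    }
    if (s[0] == '0' && (s[1] == 'b' || s[1] == 'B')) {
        return all_in(s + 2, "01", 0); /* binary: 1+ digits */
    }
    if (s[0] == '0') {
        return all_in(s + 1, "01234567", 1); /* octal: 0 then 0+ digits */
    }
    if (s[0] >= '1' && s[0] <= '9') {
        return all_in(s + 1, "0123456789", 1); /* decimal: no leading 0 */
    }
    return 0;
}
```

Under this reading of the grammar, a lone 0 is an octal constant and a string such as 08 is rejected because 8 is not an octal digit.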

[4004] Character Constants

[4005] FIG. 87A illustrates escape codes and their associated meanings 8700, in accordance with one embodiment of the present invention. Character is any printable character or any of the following escape codes.

[4006] Strings

[4007] string ::= “{character}”

[4008] Floating Point Constants

[4009] float_constant::=

[4010] [{0 . . . 9}+].{0 . . . 9}+[(e | E)[+|−]{0 . . . 9}+][f | F | l |L]

[4011] | {0 . . . 9}+.[(e | E)[+|−]{0 . . . 9}+][f | F |l | L]

[4012] | {0 . . . 9}+(e | E)[+|−]{0 . . . 9}+[f | F | l | L]

[4013] Overview

[4014] external_declaration ::= function_definition

[4015] | declaration

[4016] | set_statement;

[4017] Functions and Declarations

[4018] function_definition ::= declaration_specifiers declarator compound_statement [with initialiser ;]

[4019] | declarator compound_statement [ with initialiser ;]

[4020] declaration ::= declaration_specifiers [ init_declarator_list][with initialiser ];

[4021] | interface_declaration

[4022] | macro_declaration

[4023] declaration_specifiers ::= storage_class_specifier [declaration_specifiers]

[4024] | type_specifier [ declaration_specifiers]

[4025] | type_qualifier [ declaration_specifiers]

[4026] storage_class_specifier ::= auto

[4027] | register

[4028] | inline

[4029] | typedef

[4030] | extern

[4031] | static

[4032] type_specifier ::= void

[4033] | char

[4034] | short

[4035] | int

[4036] | long

[4037] | float

[4038] | double

[4039] | signed

[4040] | unsigned

[4041] | typeof (expression)

[4042] | signal_specifier

[4043] | channel_specifier

[4044] | ram_specifier

[4045] | struct_or_union_specifier

[4046] | enum_specifier

[4047] | typedef_name

[4048] type_qualifier ::= const | volatile

[4049] typedef_name::= identifier

[4050] init_declarator_list ::= declarator [= initialiser] {, declarator [= initialiser]}

[4051] Macro/Shared exprs/procs

[4052] macro_declaration ::= macro_proc_decl

[4053] | macro_expr_decl

[4054] macro_proc_decl ::= [ static | extern] macro_proc_spec identifier

[4055] [ ([macro_param{, macro_param} ] )]

[4056] statement

[4057] [ with initialiser ;]

[4058] macro_expr_decl ::= [ static | extern] macro_expr_spec identifier

[4059] [ ([macro_param{, macro_param}]

[4060] )];

[4061] | [ static | extern] macro_expr_spec identifier

[4062] [ ([macro_param{, macro_param} ] )] = let_initialiser

[4063] [with initialiser ];

[4064] macro_proc_spec ::= macro proc

[4065] | shared proc

[4066] macro_expr_spec ::= macro expr

[4067] | shared expr

[4068] let_initialiser ::= initialiser

[4069] | let macro_expr_decl in let_initialiser

[4070] macro_param ::= identifier

[4071] Interfaces

[4072] interface_declaration ::= interface identifier ([int_parameter_declaration

[4073] { , int_parameter_declaration} ] )

[4074] identifier ([ assignment_expr_spec {,

[4075] assignment_expr_spec} ] ) [with

[4076] initialiser];

[4077] | interface_type_declarator

[4078] | old_style_interface_declarator

[4079] interface_type_declarator ::= interface identifier ([int_parameter_proto

[4080] {, int_parameter_proto}] )

[4081] identifier ([int_init_parameter_declaration {,

[4082] int_init_parameter_declaration} ]

[4083] )

[4084] This format is deprecated but retained for compatibility reasons

[4085] old_style_interface_declarator ::= interface identifier ( [

[4086] int_parameter_declaration

[4087] {,int_parameter_declaration}] )

[4088] identifier ([assignment_expr_spec {,

[4089] assignment_expr_spec}] )

[4090] [with initialiser ];

[4091] interface ::= [ static | extern] interface

[4092] int_parameter_proto::= declaration_specifiers

[4093] | declaration_specifiers declarator

[4094] | declaration_specifiers abstract_declarator

[4095] | declaration_specifiers width

[4096] int_parameter_declaration ::= declaration_specifiers [with initialiser ]

[4097] | declaration_specifiers declarator [with initialiser ]

[4098] | declaration_specifiers abstract_declarator [with initialiser ]

[4099] | declaration_specifiers width [with initialiser ]

[4100] int_init_parameter_declaration ::= int_parameter_declaration

[4101] | declaration_specifiers declarator [ = initialiser] [with initialiser ]

[4102] assignment_expr_spec ::= assignment_expression [with initialiser]

[4103] Structures and Unions

[4104] struct_or_union_specifier ::= aggregate_form [ identifier] {

[4105] {struct_declaration}+ }

[4106] | aggregate_form identifier

[4107] aggregate_form ::= struct

[4108] | union

[4109] | mpram

[4110] struct_declaration ::= { type_specifier | type_qualifier}+

[4111] {struct_declarator}+[with initialiser ];

[4112] struct_declarator ::= declarator

[4113] | [declarator]: constant_expression

[4114] Enumerated Types

[4115] enum_specifier ::= enum [ identifier] { enumerator {,[enumerator] }

[4116] | enum identifier

[4117] enumerator ::= identifier

[4118] | identifier = constant_expression

[4119] Signal Specifiers

[4120] signal_specifier ::= signal <type_name>

[4121] | signal

[4122] Channel Specifiers

[4123] channel_specifier ::= chan [ <type_name> ]

[4124] | chanin [ <type_name> ]

[4125] | chanout [ <type_name> ]

[4126] Ram Specifiers

[4127] ram_specifier ::= ram [ <type_name> ]

[4128] | rom [ <type_name> ]

[4129] | wom [ <type_name> ]

[4130] Declarators

[4131] declarator ::= [width] pointer direct_declarator

[4132] width ::= undefined

[4133] | primary_expression

[4134] direct_declarator ::= identifier

[4135] | (pointer direct_declarator )

[4136] | direct_declarator [ [constant_expression] ]

[4137] | direct_declarator ( [ {parameter_declaration}+ ])

[4138] pointer ::= *

[4139] | * type_qualifier

[4140] | * pointer

[4141] | * type_qualifier pointer

[4142] Function Parameters

[4143] parameter_declaration ::= declaration_specifiers

[4144] | declaration_specifiers width

[4145] | declaration_specifiers abstract_declarator

[4146] | declaration_specifiers declarator

[4147] Type Names and Abstract Declarators

[4148] type_name ::= { type_specifier | type_qualifier}+

[4149] | { type_specifier | type_qualifier}+ abstract_declarator

[4150] | { type_specifier | type_qualifier}+ width

[4151] abstract_declarator ::= [width] pointer direct_abstract_declarator

[4152] direct_abstract_declarator ::= ( pointer direct_abstract_declarator )

| [direct_abstract_declarator] [ [constant_expression] ]

[4153] | [direct_abstract_declarator] ( [ {parameter_declaration}+ ]

[4154] )

[4155] Statements

[4156] statement ::= semi_statement;

[4157] | non_semi_statement

[4158] semi_statement ::= expression_statement

[4159] | do statement while ( expression )

[4160] | jump_statement

[4161] | assert ( constant_expression [, assignment_expression{,

[4162] assignment_expression}] )

[4163] | delay

[4164] | channel_statement

[4165] | set_statement

[4166] non_semi_statement ::= labeled_statement

[4167] | compound_statement

[4168] | selection_statement

[4169] | iteration_statement

[4170] The following statements can appear in for start/end conditions

[4171] for_statement ::= non_semi_statement

[4172] | expression_statement

[4173] | do statement while ( expression )

[4174] | assert ( constant_expression [,assignment_expression{,

[4175] assignment_expression}] )

[4176] | delay

[4177] | channel_statement

[4178] These are the statements that can appear in prialt blocks:

[4179] prialt_statement ::= semi_statement;

[4180] | non_semi_prialt_statement

[4181] non_semi_prialt_statement ::= prialt_labeled_statement

[4182] | compound_statement

[4183] | selection_statement

[4184] | iteration_statement

[4185] labeled_statement ::= identifier : statement

[4186] | case constant_expression : statement

[4187] | default : statement

[4188] prialt_labeled_statement ::= identifier : prialt_statement

[4189] | case channel_statement : prialt_statement

[4190] | default: prialt_statement

[4191] expression_statement ::= [expression]

[4192] channel_statement ::= unary_expression ! expression

[4193] | logical_or_expression ? expression

[4194] jump_statement ::= goto identifier

[4195] | continue

[4196] | break

[4197] | return

[4198] | return expression

[4199] selection_statement ::= if ( expression ) statement %prec if

[4200] | if ( expression ) statement else statement

[4201] | ifselect ( constant_expression ) statement %prec if

[4202] | ifselect ( constant_expression ) statement else statement

[4203] | switch ( expression ) statement

[4204] | prialt { [{prialt_statement}+] }

[4205] set_statement ::= set part = STRING

[4206] | set clock = clock

[4207] | set family = identifier

[4208] | set intwidth = constant_expression

[4209] | set intwidth = undefined

[4210] clock ::= internal expression [with initialiser]

[4211] | external expression [with initialiser ]

[4212] | internal_divide expression expression [with initialiser ]

[4213] | external_divide expression expression [with initialiser ]

[4214] iteration_statement ::= while ( expression ) statement

[4215] | for ( [for_statement]; [ expression]; [for_statement] )

[4216] statement

[4217] Compound Statements with Replicators

[4218] compound_statement ::= [seq | par] {{ declaration} {statement} }

[4219] | [seq | par] ( [repl_macro_param{, repl_macro_param}];

[4220] constant_expression;

[4221] [repl_update_param {, repl_update_param}] )

[4222] {{declaration} {statement} }

[4223] Replicator Rules

[4224] Replicator Initialisation Definitions

[4225] repl_macro_param ::= repl_param = initialiser

[4226] | (repl_param = initialiser )

[4227] Replicator Update Definitions

[4228] repl_update_param ::= repl_update_param_body

[4229] | (repl_update_param )

[4230] repl_update_param_body ::= repl_param assignment_operator initialiser

[4231] | ++ repl_param

[4232] | repl_param ++

[4233] | -- repl_param

[4234] | repl_param --

[4235] repl_param ::= identifier

[4236] | ( repl_param )

[4237] Expressions

[4238] constant_expression ::= assignment_expression

[4239] expression ::= assignment_expression

[4240] | expression , assignment_expression

[4241] assignment_expression ::= conditional_expression

[4242] | unary_expression assignment_operator assignment_expression

[4243] assignment_operator ::= = | *= | /= | %= | += | -= | <<= | >>= | &= | ^= | |=

[4244] initialiser ::= assignment_expression

[4245] conditional_expression ::= logical_or_expression

[4246] | logical_or_expression ? expression : conditional_expression

[4247] logical_or_expression ::= logical_and_expression

[4248] | logical_or_expression || logical_and_expression

[4249] logical_and_expression ::= inclusive_or_expression

[4250] | logical_and_expression && inclusive_or_expression

[4251] inclusive_or_expression ::= exclusive_or_expression

[4252] | inclusive_or_expression | exclusive_or_expression

[4253] exclusive_or_expression ::= and_expression

[4254] | exclusive_or_expression ^ and_expression

[4255] and_expression ::= equality_expression

[4256] | and_expression & equality_expression

[4257] equality_expression ::= relational_expression

[4258] | equality_expression == relational_expression

[4259] | equality_expression != relational_expression

[4260] relational_expression ::= cat_expression

[4261] | relational_expression < cat_expression

[4262] | relational_expression > cat_expression

[4263] | relational_expression <= cat_expression

[4264] | relational_expression >= cat_expression

[4265] cat_expression ::= shift_expression

[4266] | cat_expression @ shift_expression

[4267] shift_expression ::= additive_expression

[4268] | shift_expression << additive_expression

[4269] | shift_expression >> additive_expression

[4270] additive_expression ::= multiplicative_expression

[4271] | additive_expression + multiplicative_expression

[4272] | additive_expression - multiplicative_expression

[4273] multiplicative_expression ::= take_drop_expression

[4274] | multiplicative_expression * take_drop_expression

[4275] | multiplicative_expression / take_drop_expression

[4276] | multiplicative_expression % take_drop_expression

[4277] take_drop_expression ::= cast_expression

[4278] | take_drop_expression <- cast_expression

[4279] | take_drop_expression \\ cast_expression

[4280] cast_expression ::= unary_expression

[4281] | ( type_name ) cast_expression

[4282] unary_expression ::= postfix_expression

[4283] | ++ unary_expression

[4284] | -- unary_expression

[4285] | unary_operator cast_expression

[4286] | sizeof unary_expression

[4287] | sizeof ( type_name )

[4288] | width ( expression )

[4289] unary_operator ::= & | + | - | ~ | ! | *

[4290] postfix_expression ::= select_expression

[4291] | postfix_expression [ expression ]

[4292] | postfix_expression [ expression : expression ]

[4293] | postfix_expression [ : expression ]

[4294] | postfix_expression [ expression : ]

[4295] | postfix_expression [ ]

[4296] | postfix_expression ( [assignment_expression

[4297] {, assignment_expression}] )

[4298] | postfix_expression . identifier

[4299] | postfix_expression -> identifier

[4300] | postfix_expression ++ | postfix_expression --

[4301] select_expression ::= primary_expression

[4302] | select ( constant_expression , constant_expression ,

[4303] constant_expression )

[4304] primary_expression ::= identifier

[4305] | constant

[4306] | ( expression )

[4307] | { }

[4308] | {[initialiser {, initialiser}[, ] ]}

[4309] constant ::= integer_constant

[4310] | character_constant

[4311] | string_constant

[4312] integer_constant ::= NUMBER

[4313] character_constant ::= CHARACTER

[4314] string_constant ::= STRING

[4315] Program

[4316] The overall syntax for the program is:

[4317] program ::= {global_declaration}

[4318] void main(void) {

[4319] {declaration}

[4320] {statement} }

Preprocessor

[4321] Introduction

[4322] Handel-C is a programming language designed to enable the compilation of programs into synchronous hardware. Handel-C is not a hardware description language, though; rather, it is a programming language aimed at expressing algorithms at a high level.

[4323] This section describes the Handel-C preprocessor. The Handel-C compiler may invoke the preprocessor automatically each time it compiles a program.

[4324] The GNU Preprocessor

[4325] Handel-C does not use its own preprocessor. Rather, it uses the GNU preprocessor written by the Free Software Foundation for the gcc C compiler. Since this section simply contains the text for the GNU C Preprocessor, not all statements may be relevant to the Handel-C compiler. For example, the section detailing the CPU predefined macros does not apply because the Handel-C program may not be executing on a processor at all.

[4326] C Preprocessor

[4327] Introduction

[4328] The C preprocessor is a macro processor that is used automatically by the C compiler to transform the program before actual compilation. It is called a macro processor because it allows one to define macros, which are brief abbreviations for longer constructs.

[4329] The C preprocessor provides four separate facilities that one can use as he or she sees fit:

[4330] Inclusion of header files. These are files of declarations that can be substituted into the program.

[4331] Macro expansion. One can define macros, which are abbreviations for arbitrary fragments of C code, and then the C preprocessor may replace the macros with their definitions throughout the program.

[4332] Conditional compilation. Using special preprocessing directives, one can include or exclude parts of the program according to various conditions.

[4333] Line control. If one uses a program to combine or rearrange source files into an intermediate file which is then compiled, he or she can use line control to inform the compiler of where each source line originally came from.

[4334] C preprocessors vary in some details. This section discusses the GNU C preprocessor, the C Compatible Compiler Preprocessor.

[4335] The GNU C preprocessor provides a superset of the features of ANSI Standard C. ANSI Standard C requires the rejection of many harmless constructs commonly used by today's C programs. Such incompatibility would be inconvenient for users, so the GNU C preprocessor is configured to accept these constructs by default. Strictly speaking, to get ANSI Standard C, one may use the options ‘-trigraphs’, ‘-undef’ and ‘-pedantic’, but in practice the consequences of having strict ANSI Standard C make it undesirable to do this.

[4336] Transformations Made Globally

[4337] Most C preprocessor features are inactive unless one gives specific directives to request their use. But there are three transformations that the preprocessor always makes on all the input it receives, even in the absence of directives.

[4338] All C comments are replaced with single spaces.

[4339] Backslash-Newline sequences are deleted, no matter where. This feature allows one to break long lines for cosmetic purposes without changing their meaning.

[4340] Predefined macro names are replaced with their expansions.

[4341] The first two transformations are done before nearly all other parsing and before preprocessing directives are recognized. Thus, for example, one can split a line cosmetically with Backslash-Newline anywhere (except when trigraphs are in use; see below).

[4342] /*

[4343] */#/*

[4344] */defi\

[4345] ne FO\

[4346] O 10\

[4347] 20

[4348] is equivalent to ‘#define FOO 1020’. One can split even an escape sequence with Backslash-Newline. For example, one can split “foo\bar” between the ‘\’ and the ‘b’ to get

[4349] “foo\\

[4350] bar”

[4351] This behavior is unclean: in all other contexts, a Backslash can be inserted in a string constant as an ordinary character by writing a double Backslash, and this creates an exception. But the ANSI C standard requires it. (Strict ANSI C does not allow Newlines in string constants, so they do not consider this a problem.)
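The Backslash-Newline rule described above can be sketched in C as follows. This is only an illustration under stated assumptions: the names `spliced`, `foo_value` and `spliced_length` are hypothetical and are not part of the original text. The continuation lines must begin in the first column, because any leading whitespace would become part of the joined line.

```c
#include <string.h>

/* The directive name, the macro name and the macro body are each split
   with Backslash-Newline. Once the splices are deleted, the line reads
   "#define FOO 1020". */
#defi\
ne FO\
O 10\
20

/* The escape sequence for backspace is split between its backslash and
   the letter b. The first backslash starts the escape sequence; the
   second one, at the end of the line, is the line splice. */
static const char spliced[] = "foo\\
bar";

/* Hypothetical helpers, used only to observe the results. */
int foo_value(void) { return FOO; }
size_t spliced_length(void) { return strlen(spliced); }
```

Under these assumptions, `foo_value` returns 1020, and the string holds six characters, the fourth of which is a backspace.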

[4352] But there are a few exceptions to all three transformations.

[4353] C comments and predefined macro names are not recognized inside a ‘#include’ directive in which the file name is delimited with ‘<’ and ‘>’.

[4354] C comments and predefined macro names are never recognized within a character or string constant. (Strictly speaking, this is the rule, not an exception, but it is worth noting here anyway.)

[4355] Backslash-Newline may not safely be used within an ANSI “trigraph”. Trigraphs are converted before Backslash-Newline is deleted. If one writes what looks like a trigraph with a Backslash-Newline inside, the Backslash-Newline is deleted as usual, but it is then too late to recognize the trigraph.

[4356] This exception is relevant only if one uses the ‘-trigraphs’ option to enable trigraph processing.

[4357] Preprocessing Directives

[4358] Most preprocessor features are active only if one uses preprocessing directives to request their use. Preprocessing directives are lines in the program that start with ‘#’. The ‘#’ is followed by an identifier that is the directive name. For example, ‘#define’ is the directive that defines a macro. Whitespace is also allowed before and after the ‘#’.

[4359] The set of valid directive names is fixed. Programs cannot define new preprocessing directives.

[4360] Some directive names require arguments; these make up the rest of the directive line and may be separated from the directive name by whitespace. For example, ‘#define’ may be followed by a macro name and the intended expansion of the macro.

[4361] A preprocessing directive cannot be more than one line in normal circumstances. It may be split cosmetically with Backslash-Newline, but that has no effect on its meaning. Comments containing Newlines can also divide the directive into multiple lines, but the comments are changed to Spaces before the directive is interpreted. The only way a significant Newline can occur in a preprocessing directive is within a string constant or character constant. Note that most C compilers that might be applied to the output from the preprocessor do not accept string or character constants containing Newlines.

[4362] The ‘#’ and the directive name cannot come from a macro expansion. For example, if ‘foo’ is defined as a macro expanding to ‘define’, that does not make ‘#foo’ a valid preprocessing directive.

[4363] Header Files

[4364] Introduction

[4365] A header file is a file containing C declarations and macro definitions to be shared between several source files. A person may request the use of a header file in the program with the C preprocessing directive ‘#include’.

[4366] Uses of Header Files

[4367] Header files serve two kinds of purposes.

[4368] System header files declare the interfaces to parts of the operating system. A person may include them in the program to supply the definitions and declarations he or she needs to invoke system calls and libraries.

[4369] A program's own header files contain declarations for interfaces between the source files of the program. Each time a person has a group of related declarations and macro definitions, all or most of which are needed in several different source files, it is a good idea to create a header file for them.

[4370] Including a header file produces the same results in C compilation as copying the header file into each source file that needs it. But such copying would be time-consuming and error-prone. With a header file, the related declarations appear in only one place. If they need to be changed, they can be changed in one place, and programs that include the header file may automatically use the new version when next recompiled. The header file eliminates the labor of finding and changing all the copies as well as the risk that a failure to find one copy may result in inconsistencies within a program.

[4371] The usual convention is to give header files names that end with ‘.h’. Avoid unusual characters in header file names, as they reduce portability.

[4372] The ‘#include’ Directive

[4373] Both user and system header files are included using the preprocessing directive ‘#include’. It has three variants:

[4374] #include <file>

[4375] This variant is used for system header files. It searches for a file named file in a list of directories specified by the user, then in a standard list of system directories. One may specify directories to search for header files with the command option ‘-I’. The option ‘-nostdinc’ inhibits searching the standard system directories; in this case only the directories one specifies are searched.

[4376] The parsing of this form of ‘#include’ is slightly special because comments are not recognized within the ‘< . . . >’. Thus, in ‘#include <x/*y>’ the ‘/*’ does not start a comment and the directive specifies inclusion of a system header file named ‘x/*y’. Of course, a header file with such a name is unlikely to exist on Unix, where shell wildcard features would make it hard to manipulate. The argument file may not contain a ‘>’ character. It may, however, contain a ‘<’ character.

[4377] #include “file”

[4378] This variant is used for header files of the program. It searches for a file named file first in the current directory, then in the same directories used for system header files. The current directory is the directory of the current input file. It is tried first because it is presumed to be the location of the files that the current input file refers to. (If the ‘-I-’ option is used, the special treatment of the current directory is inhibited.)

[4379] The argument file may not contain ‘"’ characters. If backslashes occur within file, they are considered ordinary text characters, not escape characters. None of the character escape sequences appropriate to string constants in C are processed. Thus, ‘#include “x\n\\y”’ specifies a filename containing three backslashes. It is not clear why this behavior is ever useful, but the ANSI standard specifies it.

[4380] #include anything else

[4381] This variant is called a computed #include. Any ‘#include’ directive whose argument does not fit the above two forms is a computed include. The text anything else is checked for macro calls, which are expanded. When this is done, the result may fit one of the above two variants. In particular, the expanded text may in the end be surrounded by either quotes or angle brackets.

[4382] This feature allows one to define a macro which controls the file name to be used at a later point in the program. One application of this is to allow a site-specific configuration file for the program to specify the names of the system include files to be used. This can help in porting the program to various operating systems in which the necessary system header files are found in different places.
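A computed ‘#include’ of the kind described above might look like the following minimal sketch. The macro name `SYSTEM_IO_HEADER` and the function `format_answer` are hypothetical, introduced only for illustration; in practice the macro would typically be set by a site-specific configuration file rather than in the same source file.

```c
/* Hypothetical configuration macro. A site-specific configuration
   file could define it to name a different header. */
#define SYSTEM_IO_HEADER <stdio.h>

/* Computed #include: the argument is macro-expanded first, and the
   result fits the angle-bracket variant. */
#include SYSTEM_IO_HEADER

/* Uses declarations that come from the included header
   (size_t and snprintf). */
int format_answer(char *buf, size_t size) {
    return snprintf(buf, size, "%d", 42);
}
```

Because the header name is supplied through a macro, only the definition of `SYSTEM_IO_HEADER` would need to change when the program is ported.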

[4383] How ‘#include’ Works

[4384] The ‘#include’ directive works by directing the C preprocessor to scan the specified file as input before continuing with the rest of the current file. The output from the preprocessor contains the output already generated, followed by the output resulting from the included file, followed by the output that comes from the text after the ‘#include’ directive. For example, given a header file ‘header.h’ as follows,

[4385] char *test ( );

[4386] and a main program called ‘program.c’ that uses the header file, like this,
int x;
#include “header.h”
main ( ) { printf (test ( )); }

[4387] the output generated by the C preprocessor for ‘program.c’ as input would be
int x;
char *test ( );
main ( ) { printf (test ( )); }

[4388] Included files are not limited to declarations and macro definitions; those are merely the typical uses. Any fragment of a C program can be included from another file. The include file could even contain the beginning of a statement that is concluded in the containing file, or the end of a statement that was started in the including file. However, a comment or a string or character constant may not start in the included file and finish in the including file. An unterminated comment, string constant or character constant in an included file is considered to end (with an error message) at the end of the file.

[4389] It is possible for a header file to begin or end a syntactic unit such as a function definition, but that would be very confusing, so it is best avoided.

[4390] The line following the ‘#include’ directive is always treated as a separate line by the C preprocessor even if the included file lacks a final newline.

[4391] Once-Only Include Files

[4392] Very often, one header file includes another. It can easily result that a certain header file is included more than once. This may lead to errors, if the header file defines structure types or typedefs, and is certainly wasteful. Therefore, it is often desired to prevent multiple inclusion of a header file.

[4393] The standard way to do this is to enclose the entire real contents of the file in a conditional, like this:

[4394] #ifndef FILE_FOO_SEEN

[4395] #define FILE_FOO_SEEN

[4396] the entire file

[4397] #endif /* FILE_FOO_SEEN */

[4398] The macro FILE_FOO_SEEN indicates that the file has been included once already. In a user header file, the macro name should not begin with ‘_’. In a system header file, this name should begin with ‘__’ to avoid conflicts with user programs. In any kind of header file, the macro name should contain the name of the file and some additional text, to avoid conflicts with other header files.

[4399] The GNU C preprocessor is programmed to notice when a header file uses this particular construct and handle it efficiently. If a header file is contained entirely in a ‘#ifndef’ conditional, then it records that fact. If a subsequent ‘#include’ specifies the same file, and the macro in the ‘#ifndef’ is already defined, then the file is entirely skipped, without even reading it.
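The effect of the ‘#ifndef’ wrapper can be sketched in a single file by writing the guarded text twice, as if a header had been included two times. The header name ‘point.h’, the guard macro `FILE_POINT_SEEN`, the structure `point` and the function `point_sum` are all hypothetical names chosen for this illustration.

```c
/* First copy of the text of a hypothetical header "point.h". */
#ifndef FILE_POINT_SEEN
#define FILE_POINT_SEEN
struct point { int x; int y; };
#endif /* FILE_POINT_SEEN */

/* Second copy, as if another header had included "point.h" again.
   FILE_POINT_SEEN is already defined, so this text is skipped and
   struct point is not defined a second time. */
#ifndef FILE_POINT_SEEN
#define FILE_POINT_SEEN
struct point { int x; int y; };
#endif /* FILE_POINT_SEEN */

int point_sum(struct point p) { return p.x + p.y; }
```

Without the conditional, the second definition of the structure would be reported as an error by the C compiler.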

[4400] There is also an explicit directive to tell the preprocessor that it need not include a file more than once. This is called ‘#pragma once’, and was used in addition to the ‘#ifndef’ conditional around the contents of the header file. ‘#pragma once’ is now obsolete and should not be used at all.

[4401] In the Objective C language, there is a variant of ‘#include’ called ‘#import’ which includes a file, but does so at most once. If one uses ‘#import’ instead of ‘#include’, then he or she doesn't need the conditionals inside the header file to prevent multiple execution of the contents.

[4402] ‘#import’ is obsolete because it is not a well designed feature. It requires the users of a header file (the applications programmers) to know that a certain header file should only be included once. It is much better for the header file's implementor to write the file so that users don't need to know this. Using ‘#ifndef’ accomplishes this goal.

[4403] Inheritance and Header Files

[4404] Inheritance is what happens when one object or file derives some of its contents by virtual copying from another object or file. In the case of C header files, inheritance means that one header file includes another header file and then replaces or adds something.

[4405] If the inheriting header file and the base header file have different names, then inheritance is straightforward: simply write ‘#include “base”’ in the inheriting file.

[4406] Sometimes it is necessary to give the inheriting file the same name as the base file. This is less straightforward.

[4407] For example, suppose an application program uses the system header file ‘sys/signal.h’, but the version of ‘/usr/include/sys/signal.h’ on a particular system doesn't do what the application program expects. It might be convenient to define a “local” version, perhaps under the name ‘/usr/local/include/sys/signal.h’, to override or add to the one supplied by the system.

[4408] One can do this by using the option ‘-I.’ for compilation, and writing a file ‘sys/signal.h’ that does what the application program expects. But making this file include the standard ‘sys/signal.h’ is not so easy: writing ‘#include <sys/signal.h>’ in that file doesn't work, because it includes the local version of the file, not the standard system version. Used in that file itself, this leads to an infinite recursion and a fatal error in compilation.

[4409] ‘#include </usr/include/sys/signal.h>’ would find the proper file, but that is not clean, since it makes an assumption about where the system header file is found. This is bad for maintenance, since it means that any change in where the system's header files are kept requires a change somewhere else.

[4410] The clean way to solve this problem is to use ‘#include_next’, which means, “Include the next file with this name.” This directive works like ‘#include’ except in searching for the specified file: it starts searching the list of header file directories after the directory in which the current file was found.

[4411] Suppose one specifies ‘-I/usr/local/include’, and the list of directories to search also includes ‘/usr/include’; and suppose that both directories contain a file named ‘sys/signal.h’. Ordinary ‘#include <sys/signal.h>’ finds the file under ‘/usr/local/include’. If that file contains ‘#include_next <sys/signal.h>’, it starts searching after that directory, and finds the file in ‘/usr/include’.

Macros

[4412] Introduction

[4413] A macro is a sort of abbreviation which one can define once and then use later. There are many complicated features associated with macros in the C preprocessor.

[4414] Simple Macros

[4415] A simple macro is a kind of abbreviation. It is a name which stands for a fragment of code. Some people refer to these as manifest constants.

[4416] Before one can use a macro, he or she must define it explicitly with the ‘#define’ directive. ‘#define’ is followed by the name of the macro and then the code it should be an abbreviation for. For example,

[4417] #define BUFFER_SIZE 1020

[4418] defines a macro named ‘BUFFER_SIZE’ as an abbreviation for the text ‘1020’. If somewhere after this ‘#define’ directive there comes a C statement of the form

[4419] foo = (char *) xmalloc (BUFFER_SIZE);

[4420] then the C preprocessor may recognize and expand the macro ‘BUFFER_SIZE’, resulting in

[4421] foo = (char *) xmalloc (1020);

[4422] The use of all upper case for macro names is a standard convention. Programs are easier to read when it is possible to tell at a glance which names are macros. Normally, a macro definition must be a single line, like all C preprocessing directives. (One can split a long macro definition cosmetically with Backslash-Newline.) There is one exception: Newlines can be included in the macro definition if within a string or character constant. This is because it is not possible for a macro definition to contain an unbalanced quote character; the definition automatically extends to include the matching quote character that ends the string or character constant. Comments within a macro definition may contain Newlines, which make no difference since the comments are entirely replaced with Spaces regardless of their contents.

[4423] Aside from the above, there is no restriction on what can go in a macro body. Parentheses need not balance. The body need not resemble valid C code. (But if it does not, one may get error messages from the C compiler when one uses the macro.)

[4424] The C preprocessor scans the program sequentially, so macro definitions take effect at the place where one writes them. Therefore, the following input to the C preprocessor:

[4425] foo = X;

[4426] #define X 4

[4427] bar = X;

[4428] produces as output
foo = X;

[4429] bar = 4;
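The same point, that a definition takes effect only from the place where it is written, can be checked with a small compilable sketch. The function names `before_define` and `after_define` are hypothetical and serve only to observe the two cases.

```c
/* Before the #define, X is an ordinary identifier, here a local
   variable, and the preprocessor leaves it alone. */
int before_define(void) {
    int X = 7;
    return X;
}

#define X 4

/* After the #define, every X is replaced by 4. */
int after_define(void) {
    return X;
}
```

Here `before_define` returns 7, the value of the variable, while `after_define` returns 4, the macro expansion.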

[4430] After the preprocessor expands a macro name, the macro's definition body is appended to the front of the remaining input, and the check for macro calls continues. Therefore, the macro body can contain calls to other macros. For example, after

[4431] #define BUFSIZE 1020

[4432] #define TABLESIZE BUFSIZE

[4433] the name ‘TABLESIZE’ when used in the program would go through two stages of expansion, resulting ultimately in ‘1020’. This is not at all the same as defining ‘TABLESIZE’ to be ‘1020’. The ‘#define’ for ‘TABLESIZE’ uses exactly the body one specifies (in this case, ‘BUFSIZE’) and does not check to see whether it too is the name of a macro. It's only when one uses ‘TABLESIZE’ that the result of its expansion is checked for more macro names.
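The difference between defining ‘TABLESIZE’ as ‘BUFSIZE’ and defining it as ‘1020’ becomes visible if ‘BUFSIZE’ is redefined later. The following sketch assumes the standard ‘#undef’ directive, which is not discussed in the text above, and uses the hypothetical function names `table_size_first` and `table_size_second`.

```c
#define BUFSIZE 1020
#define TABLESIZE BUFSIZE

/* TABLESIZE expands to BUFSIZE, which in turn expands to 1020. */
int table_size_first(void) { return TABLESIZE; }

/* Redefine BUFSIZE; the definition of TABLESIZE itself is untouched. */
#undef BUFSIZE
#define BUFSIZE 37

/* TABLESIZE still expands to BUFSIZE, which now expands to 37. */
int table_size_second(void) { return TABLESIZE; }
```

The second function returns 37 because the body of ‘TABLESIZE’ is rescanned for macro names each time it is used, not when it is defined.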

[4434] Macros with Arguments

[4435] A simple macro always stands for exactly the same text, each time it is used. Macros can be more flexible when they accept arguments. Arguments are fragments of code that one supplies each time the macro is used. These fragments are included in the expansion of the macro according to the directions in the macro definition. A macro that accepts arguments is called a function-like macro because the syntax for using it looks like a function call. To define a macro that uses arguments, one writes a ‘#define’ directive with a list of argument names in parentheses after the name of the macro. The argument names may be any valid C identifiers, separated by commas and optionally whitespace. The open parenthesis must follow the macro name immediately, with no space in between.

[4436] For example, here is a macro that computes the minimum of two numeric values, as it is defined in many C programs:

[4437] #define min(X, Y) ((X) < (Y) ? (X) : (Y))

[4438] To use a macro that expects arguments, one writes the name of the macro followed by a list of actual arguments in parentheses, separated by commas. The number of actual arguments one gives must match the number of arguments the macro expects. Examples of use of the macro ‘min’ include ‘min (1, 2)’ and ‘min (x + 28, *p)’.

[4439] The expansion text of the macro depends on the arguments a person uses. Each of the argument names of the macro is replaced, throughout the macro definition, with the corresponding actual argument. Using the same macro ‘min’ defined above, ‘min (1, 2)’ expands into ‘((1) < (2) ? (1) : (2))’ where ‘1’ has been substituted for ‘X’ and ‘2’ for ‘Y’. Likewise, ‘min (x + 28, *p)’ expands into

[4440] ((x + 28) < (*p) ? (x + 28) : (*p))
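The two uses of ‘min’ discussed above can be wrapped in small functions to confirm what the expansions compute. The wrapper names `smaller_constant` and `smaller_mixed` are hypothetical and are used only in this sketch.

```c
#define min(X, Y) ((X) < (Y) ? (X) : (Y))

/* Expands to ((1) < (2) ? (1) : (2)). */
int smaller_constant(void) { return min(1, 2); }

/* Expands to ((x + 28) < (*p) ? (x + 28) : (*p)). */
int smaller_mixed(int x, const int *p) { return min(x + 28, *p); }
```

The parentheses around each argument in the macro body keep an argument such as ‘x + 28’ grouped as a unit after substitution.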

[4441] Parentheses in the actual arguments must balance; a comma within parentheses does not end an argument. However, there is no requirement for brackets or braces to balance, and they do not prevent a comma from separating arguments. Thus, ‘macro (array[x = y, x + 1])’ passes two arguments to macro: ‘array[x = y’ and ‘x + 1]’. If one wants to supply ‘array[x = y, x + 1]’ as an argument, one must write it as ‘array[(x = y, x + 1)]’, which is equivalent C code.
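As a hedged illustration of the rule about commas, the sketch below passes a subscript containing a comma expression to a one-argument macro. The macro `IDENTITY` and the function `pick_element` are hypothetical names introduced for this example.

```c
/* A one-argument macro that simply parenthesizes its argument. */
#define IDENTITY(expr) (expr)

int pick_element(void) {
    int array[4] = {10, 20, 30, 40};
    int x = 0, y = 1;
    /* The inner parentheses keep the comma inside a single argument.
       Written as IDENTITY(array[x = y, x + 1]), the call would supply
       two arguments to a one-argument macro and would be rejected. */
    return IDENTITY(array[(x = y, x + 1)]);
}
```

The comma expression first assigns 1 to `x` and then yields 2 as the subscript, so the function returns the third element, 30.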

[4442] After the actual arguments are substituted into the macro body, the entire result is appended to the front of the remaining input, and the check for macro calls continues. Therefore, the actual arguments can contain calls to other macros, either with or without arguments, or even to the same macro. The macro body can also contain calls to other macros. For example, ‘min (min (a, b), c)’ expands into this text:

[4443] ((((a) < (b) ? (a) : (b))) < (c)

[4444] ? (((a) < (b) ? (a) : (b)))

[4445] : (c))

[4446] (Line breaks shown here for clarity would not actually be generated.) If a macro foo takes one argument, and one wants to supply an empty argument, he or she must write at least some whitespace between the parentheses, like this: ‘foo ( )’. Just ‘foo ()’ is providing no arguments, which is an error if foo expects an argument. But ‘foo0 ()’ is the correct way to call a macro defined to take zero arguments, like this:

[4447] #define foo0( )

[4448] If one uses the macro name followed by something other than an open-parenthesis (after ignoring any spaces, tabs and comments that follow), it is not a call to the macro, and the preprocessor does not change what he or she has written. Therefore, it is possible for the same name to be a variable or function in the program as well as a macro, and one can choose in each instance whether to refer to the macro (if an actual argument list follows) or the variable or function (if an argument list does not follow).

[4449] Such dual use of one name could be confusing and should be avoided except when the two meanings are effectively synonymous: that is, when the name is both a macro and a function and the two have similar effects. One can think of the name simply as a function; use of the name for purposes other than calling it (such as, to take the address) may refer to the function, while calls may expand the macro and generate better but equivalent code. For example, one can use a function named ‘min’ in the same source file that defines the macro. If one writes ‘&min’ with no argument list, one refers to the function. If one writes ‘min (x, bb)’, with an argument list, the macro is expanded. If one writes ‘(min) (a, bb)’, where the name ‘min’ is not followed by an open-parenthesis, the macro is not expanded, so one winds up with a call to the function ‘min’.
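The dual use of ‘min’ as both a function and a macro can be sketched as follows. The call counter `function_calls` and the wrappers `via_macro`, `via_function`, `via_pointer` and `calls_so_far` are hypothetical; they exist only to show which form actually reaches the function.

```c
static int function_calls = 0;

/* The function is defined before the macro, so its name is not
   expanded here. It counts how often it is really called. */
int min(int a, int b) {
    ++function_calls;
    return a < b ? a : b;
}

#define min(X, Y) ((X) < (Y) ? (X) : (Y))

/* Name followed by an argument list: the macro is expanded. */
int via_macro(int a, int b) { return min(a, b); }

/* Name followed by a close-parenthesis: the function is called. */
int via_function(int a, int b) { return (min)(a, b); }

/* Taking the address refers to the function as well. */
int via_pointer(int a, int b) {
    int (*fp)(int, int) = &min;
    return fp(a, b);
}

int calls_so_far(void) { return function_calls; }
```

All three wrappers compute the same minimum, but only `via_function` and `via_pointer` increase the counter.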

[4450] One may not define the same name as both a simple macro and a macro with arguments. In the definition of a macro with arguments, the list of argument names must follow the macro name immediately with no space in between. If there is a space after the macro name, the macro is defined as taking no arguments, and all the rest of the line is taken to be the expansion. The reason for this is that it is often useful to define a macro that takes no arguments and whose definition begins with an identifier in parentheses. This rule about spaces makes it possible for one to do either this:

[4451] #define FOO(x) - 1 / (x)

[4452] (which defines ‘FOO’ to take an argument and expand into minus the reciprocal of that argument) or this:

[4453] #define BAR (x) - 1 / (x)

[4454] (which defines ‘BAR’ to take no argument and always expand into ‘(x) - 1 / (x)’).

[4455] Note that the uses of a macro with arguments can have spaces before the left parenthesis; it's the definition where it matters whether there is a space.
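A minimal sketch of the space rule, assuming floating-point operands so that the reciprocal is visible; the wrapper functions ‘use_foo’ and ‘use_bar’ are hypothetical names introduced only for illustration.

```c
#define FOO(x) - 1 / (x)    /* no space: takes one argument */
#define BAR (x) - 1 / (x)   /* space: no arguments, body is "(x) - 1 / (x)" */

/* The call may have a space before '('; expands to "- 1 / (v)". */
double use_foo(double v)
{
    return FOO (v);
}

/* BAR expands to "(x) - 1 / (x)", which picks up the local variable x. */
double use_bar(double x)
{
    return BAR;
}
```

With these definitions, ‘use_foo (4.0)’ yields minus one quarter, while ‘use_bar (2.0)’ yields two minus one half.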

[4456] Predefined Macros

[4457] Several simple macros are predefined. One can use them without giving definitions for them. They fall into two classes: standard macros and system-specific macros.

[4458] Standard Predefined Macros

[4459] The standard predefined macros are available with the same meanings regardless of the machine or operating system on which one is using GNU C. Their names all start and end with double underscores. Those preceding ‘__GNUC__’ in this table are standardized by ANSI C; the rest are GNU C extensions.

[4460] __FILE__

[4461] This macro expands to the name of the current input file, in the form of a C string constant. The precise name returned is the one that was specified in ‘#include’ or as the input file name argument.

[4462] __LINE__

[4463] This macro expands to the current input line number, in the form of a decimal integer constant. While one can call it a predefined macro, it's a pretty strange macro, since its “definition” changes with each new line of source code. This and ‘__FILE__’ are useful in generating an error message to report an inconsistency detected by the program; the message can state the source line at which the inconsistency was detected. For example,

[4464] fprintf (stderr, "Internal error: "

[4465] "negative string length "

[4466] "%d at %s, line %d.",

[4467] length, __FILE__, __LINE__);

[4468] A ‘#include’ directive changes the expansions of ‘__FILE__’ and ‘__LINE__’ to correspond to the included file. At the end of that file, when processing resumes on the input file that contained the ‘#include’ directive, the expansions of ‘__FILE__’ and ‘__LINE__’ revert to the values they had before the ‘#include’ (but ‘__LINE__’ is then incremented by one as processing moves to the line after the ‘#include’).

[4469] The expansions of both ‘__FILE__’ and ‘__LINE__’ are altered if a ‘#line’ directive is used.
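The following sketch shows one way the two macros might be captured at a call site, and how a ‘#line’ directive alters them. The helper ‘format_error’, the wrapper macro ‘FORMAT_ERROR’, and the file name ‘renamed.c’ are assumptions made for illustration; ‘snprintf’ assumes a C99 library.

```c
#include <stdio.h>

/* Hypothetical helper: builds the diagnostic text into buf. */
int format_error(char *buf, size_t size, int length,
                 const char *file, int line)
{
    return snprintf(buf, size,
                    "Internal error: negative string length "
                    "%d at %s, line %d.", length, file, line);
}

/* Expands where it is used, so it records the caller's file and line. */
#define FORMAT_ERROR(buf, size, length) \
    format_error((buf), (size), (length), __FILE__, __LINE__)

/* '#line' overrides both expansions for the lines that follow it. */
#line 100 "renamed.c"
int line_after_directive(void) { return __LINE__; }
const char *file_after_directive(void) { return __FILE__; }
```

The line immediately after the directive is numbered 100, so ‘line_after_directive’ returns that number and ‘file_after_directive’ returns the substituted file name.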

[4470] __DATE__

[4471] This macro expands to a string constant that describes the date on which the preprocessor is being run. The string constant contains eleven characters and looks like ‘“Jan 29 1987”’ or ‘“Apr  1 1905”’.

[4472] __TIME__

[4473] This macro expands to a string constant that describes the time at which the preprocessor is being run. The string constant contains eight characters and looks like

[4474] ‘“23:59:01”’.

[4475] __STDC__

[4476] This macro expands to the constant 1, to signify that this is ANSI Standard C. (Whether that is actually true depends on what C compiler may operate on the output from the preprocessor.)

[4477] __STDC_VERSION__

[4478] This macro expands to the C Standard's version number, a long integer constant of the form ‘yyyymmL’ where yyyy and mm are the year and month of the Standard version. This signifies which version of the C Standard the preprocessor conforms to. Like ‘__STDC__’, whether this version number is accurate for the entire implementation depends on what C compiler may operate on the output from the preprocessor.

[4479] __GNUC__

[4480] This macro is defined if and only if this is GNU C. This macro is defined only when the entire GNU C compiler is in use; if one invokes the preprocessor directly, ‘__GNUC__’ is undefined.

[4481] The value identifies the major version number of GNU CC (‘1’ for GNU CC version 1, which is now obsolete, and ‘2’ for version 2).

[4482] __GNUC_MINOR__

[4483] The macro contains the minor version number of the compiler. This can be used to work around differences between different releases of the compiler (for example, if gcc 2.6.3 is known to support a feature, one can test for ‘__GNUC__ > 2 || (__GNUC__ == 2 && __GNUC_MINOR__ >= 6)’). The last number, ‘3’ in the example above, denotes the bugfix level of the compiler; no macro contains this value.
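The version test quoted above might be used in a conditional as sketched here. The macro name ‘HAVE_FEATURE’ and the function ‘have_feature’ are hypothetical; the extra ‘defined’ check is an assumption added so that the test also behaves sensibly under compilers that do not define ‘__GNUC__’.

```c
/* Enable a (hypothetical) feature only for GNU C 2.6 or later. */
#if defined(__GNUC__) && \
    (__GNUC__ > 2 || (__GNUC__ == 2 && __GNUC_MINOR__ >= 6))
#define HAVE_FEATURE 1
#else
#define HAVE_FEATURE 0
#endif

/* Reports which branch of the conditional was taken at compile time. */
int have_feature(void)
{
    return HAVE_FEATURE;
}
```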

[4484] __GNUG__

[4485] The GNU C compiler defines this when the compilation language is C++; use ‘__GNUG__’ to distinguish between GNU C and GNU C++.

[4486] __cplusplus

[4487] The draft ANSI standard for C++ used to require predefining this variable. Though it is no longer required, GNU C++ continues to define it, as do other popular C++ compilers. One can use ‘__cplusplus’ to test whether a header is compiled by a C compiler or a C++ compiler.

[4488] __STRICT_ANSI__

[4489] This macro is defined if and only if the ‘-ansi’ switch was specified when GNU C was invoked. Its definition is the null string. This macro exists primarily to direct certain GNU header files not to define certain traditional Unix constructs which are incompatible with ANSI C.

[4490] __BASE_FILE__

[4491] This macro expands to the name of the main input file, in the form of a C string constant. This is the source file that was specified as an argument when the C compiler was invoked.

[4492] __INCLUDE_LEVEL__

[4493] This macro expands to a decimal integer constant that represents the depth of nesting in include files. The value of this macro is incremented on every ‘#include’ directive and decremented at every end of file. For input files specified by command line arguments, the nesting level is zero.

[4494] __VERSION__

[4495] This macro expands to a string which describes the version number of GNU C. The string is normally a sequence of decimal numbers separated by periods, such as ‘“2.6.0”’.

[4496] The only reasonable use of this macro is to incorporate it into a string constant.

[4497] __OPTIMIZE__

[4498] This macro is defined in optimizing compilations. It causes certain GNU header files to define alternative macro definitions for some system library functions. It is unwise to refer to or test the definition of this macro unless one makes very sure that programs will execute with the same effect regardless.

[4499] __CHAR_UNSIGNED__

[4500] This macro is defined if and only if the data type char is unsigned on the target machine. It exists to cause the standard header file ‘limits.h’ to work correctly. It is bad practice to refer to this macro; instead, it is best to refer to the standard macros defined in ‘limits.h’. The preprocessor uses this macro to determine whether or not to sign-extend large character constants written in octal.

[4501] __REGISTER_PREFIX__

[4502] This macro expands to a string describing the prefix applied to cpu registers in assembler code. It can be used to write assembler code that is usable in multiple environments. For example, in the ‘m68k-aout’ environment it expands to the string ‘“”’, but in the ‘m68k-coff’ environment it expands to the string ‘“%”’.

[4503] __USER_LABEL_PREFIX__

[4504] This macro expands to a string describing the prefix applied to user generated labels in assembler code. It can be used to write assembler code that is usable in multiple environments. For example, in the ‘m68k-aout’ environment it expands to the string ‘“_”’, but in the ‘m68k-coff’ environment it expands to the string ‘“”’.

[4505] Nonstandard Predefined Macros

[4506] The C preprocessor normally has several predefined macros that vary between machines because their purpose is to indicate what type of system and machine is in use. This description, being for all systems and machines, cannot tell one exactly what their names are; instead, a list of some typical ones is offered. One can use ‘cpp -dM’ to see the values of predefined macros.

[4507] Some nonstandard predefined macros describe the operating system in use, with more or less specificity. For example, ‘unix’ is normally predefined on all Unix systems.

[4508] BSD

[4509] ‘BSD’ is predefined on recent versions of Berkeley Unix. Other nonstandard predefined macros describe the kind of CPU, with more or less specificity. For example,

[4510] vax

[4511] ‘vax’ is predefined on Vax computers.

[4512] mc68000

[4513] ‘mc68000’ is predefined on most computers whose CPU is a Motorola 68000, 68010 or 68020.

[4514] m68k

[4515] ‘m68k’ is also predefined on most computers whose CPU is a 68000, 68010 or 68020; however, some makers use ‘mc68000’ and some use ‘m68k’. Some predefine both names. What happens in GNU C depends on the system one is using it on.

[4516] M68020

[4517] ‘M68020’ has been observed to be predefined on some systems that use 68020 CPUs, in addition to ‘mc68000’ and ‘m68k’, which are less specific.

[4518] _AM29K, _AM29000

[4519] Both ‘_AM29K’ and ‘_AM29000’ are predefined for the AMD 29000 CPU family.

[4520] ns32000

[4521] ‘ns32000’ is predefined on computers which use the National Semiconductor 32000 series CPU. Yet other nonstandard predefined macros describe the manufacturer of the system. For example,

[4522] sun

[4523] ‘sun’ is predefined on all models of Sun computers.

[4524] pyr

[4525] ‘pyr’ is predefined on all models of Pyramid computers.

[4526] sequent

[4527] ‘sequent’ is predefined on all models of Sequent computers.

[4528] These predefined symbols are not only nonstandard, they are contrary to the ANSI standard because their names do not start with underscores. Therefore, the option ‘-ansi’ inhibits the definition of these symbols.

[4529] This tends to make ‘-ansi’ useless, since many programs depend on the customary nonstandard predefined symbols. Even system header files check them and may generate incorrect declarations if they do not find the names that are expected. One might think that the header files supplied for the Uglix computer would not need to test what machine they are running on, because they can simply assume it is the Uglix; but often they do, and they do so using the customary names. As a result, very few C programs will compile with ‘-ansi’. It is intended to avoid such problems on the GNU system.

[4530] What, then, should one do in an ANSI C program to test the type of machine it will run on? GNU C offers a parallel series of symbols for this purpose, whose names are made from the customary ones by adding ‘__’ at the beginning and end. Thus, the symbol ‘__vax__’ would be available on a Vax, and so on.

[4531] The set of nonstandard predefined names in the GNU C preprocessor is controlled (when cpp is itself compiled) by the macro ‘CPP_PREDEFINES’, which should be a string containing ‘-D’ options, separated by spaces. For example, on the Sun 3, the following definition is used:

[4532] #define CPP_PREDEFINES "-Dmc68000 -Dsun -Dunix -Dm68k"

[4533] This macro is usually specified in ‘tm.h’.

[4534] Stringification

[4535] Stringification means turning a code fragment into a string constant whose contents are the text for the code fragment. For example, stringifying ‘foo (z)’ results in ‘“foo (z)”’.

[4536] In the C preprocessor, stringification is an option available when macro arguments are substituted into the macro definition. In the body of the definition, when an argument name appears, the character ‘#’ before the name specifies stringification of the corresponding actual argument when it is substituted at that point in the definition. The same argument may be substituted in other places in the definition without stringification if the argument name appears in those places with no ‘#’.

[4537] Here is an example of a macro definition that uses stringification:

[4538] #define WARN_IF(EXP) \

[4539] do { if (EXP) \

[4540] fprintf (stderr, "Warning: " #EXP "\n"); } \

[4541] while (0)

[4542] Here the actual argument for ‘EXP’ is substituted once as given, into the ‘if’ statement, and once as stringified, into the argument to ‘fprintf’. The ‘do’ and ‘while (0)’ are a kludge to make it possible to write ‘WARN_IF (arg);’, which the resemblance of ‘WARN_IF’ to a function would make C programmers want to do.

[4543] The stringification feature is limited to transforming one macro argument into one string constant: there is no way to combine the argument with other text and then stringify it all together. But the example above shows how an equivalent result can be obtained in ANSI Standard C using the feature that adjacent string constants are concatenated as one string constant. The preprocessor stringifies the actual value of ‘EXP’ into a separate string constant, resulting in text like

[4544] do { if (x == 0) \

[4545] fprintf (stderr, "Warning: " "x == 0" "\n"); } \

[4546] while (0)

[4547] but the C compiler then sees three consecutive string constants and concatenates them into one, producing effectively

[4548] do { if (x == 0) \

[4549] fprintf (stderr, "Warning: x == 0\n"); } \

[4550] while (0)

[4551] Stringification in C involves more than putting doublequote characters around the fragment; it is necessary to put backslashes in front of all doublequote characters, and all backslashes in string and character constants, in order to get a valid C string constant with the proper contents. Thus, stringifying ‘p = “foo\n”;’ results in ‘“p = \“foo\\n\”;”’. However, backslashes that are not inside of string or character constants are not duplicated: ‘\n’ by itself stringifies to ‘“\n”’. Whitespace (including comments) in the text being stringified is handled according to precise rules. All leading and trailing whitespace is ignored. Any sequence of whitespace in the middle of the text is converted to a single space in the stringified result.
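The escaping and whitespace rules just described can be checked with a small sketch. The macro name ‘STR’ and the three accessor functions are hypothetical names chosen for illustration.

```c
#define STR(s) #s

/* Plain fragment: the text is simply wrapped in doublequotes. */
const char *simple(void)
{
    return STR(foo (z));
}

/* Doublequotes and backslashes inside the string constant are escaped. */
const char *escaped(void)
{
    return STR(p = "foo\n";);
}

/* Leading and trailing whitespace is dropped; inner runs (including the
   comment) collapse to a single space. */
const char *spaced(void)
{
    return STR(   a   /* note */ +   b   );
}
```

At run time the second string contains a backslash followed by the letter n rather than a newline character, because the backslash inside the original string constant was doubled.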

[4552] Concatenation

[4553] Concatenation means joining two strings into one. In the context of macro expansion, concatenation refers to joining two lexical units into one longer one. Specifically, an actual argument to the macro can be concatenated with another actual argument or with fixed text to produce a longer name. The longer name might be the name of a function, variable or type, or a C keyword; it might even be the name of another macro, in which case it will be expanded.

[4554] When one defines a macro, one requests concatenation with the special operator ‘##’ in the macro body. When the macro is called, after actual arguments are substituted, all ‘##’ operators are deleted, and so is any whitespace next to them (including whitespace that was part of an actual argument). The result is to concatenate the syntactic tokens on either side of the ‘##’.

[4555] Consider a C program that interprets named commands. There probably needs to be a table of commands, perhaps an array of structures declared as follows:

[4556] struct command { char *name; void (*function) (); }; struct command commands[] = { { "quit", quit_command }, { "help", help_command }, ... };

[4557] It would be cleaner not to have to give each command name twice, once in the string constant and once in the function name. A macro which takes the name of a command as an argument can make this unnecessary. The string constant can be created with stringification, and the function name by concatenating the argument with ‘_command’. Here is how it is done:

[4558] #define COMMAND(NAME) { #NAME, NAME ## _command }

[4559] struct command commands[ ] =

[4560] {

[4561] COMMAND (quit),

[4562] COMMAND (help),

[4563] . . .

[4564] };
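A self-contained version of the command table might look like the sketch below. The handler bodies, their integer return values, and the lookup function ‘run_command’ are assumptions added so that the table can be exercised; only the ‘COMMAND’ macro and the table layout come from the text.

```c
#include <string.h>

struct command {
    const char *name;
    int (*function)(void);
};

/* Hypothetical handlers; the return values are arbitrary markers. */
static int quit_command(void) { return 1; }
static int help_command(void) { return 2; }

/* #NAME makes the string; NAME ## _command makes the function name. */
#define COMMAND(NAME) { #NAME, NAME ## _command }

static struct command commands[] = {
    COMMAND(quit),
    COMMAND(help),
};

/* Hypothetical lookup: runs the named command, or returns -1 if absent. */
int run_command(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof commands / sizeof commands[0]; i++)
        if (strcmp(commands[i].name, name) == 0)
            return commands[i].function();
    return -1;
}
```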

[4565] The usual case of concatenation is concatenating two names (or a name and a number) into a longer name. But this isn't the only valid case. It is also possible to concatenate two numbers (or a number and a name, such as ‘1.5’ and ‘e3’) into a number. Also, multicharacter operators such as ‘+=’ can be formed by concatenation. In some cases it is even possible to piece together a string constant. However, two pieces of text that don't together form a valid lexical unit cannot be concatenated. For example, concatenation with ‘x’ on one side and ‘+’ on the other is not meaningful because those two characters can't fit together in any lexical unit of C. The ANSI standard says that such attempts at concatenation are undefined, but in the GNU C preprocessor it is well defined: it puts the ‘x’ and ‘+’ side by side with no particular special results. Keep in mind that the C preprocessor converts comments to whitespace before macros are even considered. Therefore, one cannot create a comment by concatenating ‘/’ and ‘*’: the ‘/*’ sequence that starts a comment is not a lexical unit, but rather the beginning of a “long” space character. Also, one can freely use comments next to a ‘##’ in a macro definition, or in actual arguments that will be concatenated, because the comments will be converted to spaces at first sight, and concatenation will later discard the spaces.

[4566] Undefining Macros

[4567] To undefine a macro means to cancel its definition. This is done with the ‘#undef’ directive. ‘#undef’ is followed by the macro name to be undefined.

[4568] Like definition, undefinition occurs at a specific point in the source file, and it applies starting from that point. The name ceases to be a macro name, and from that point on it is treated by the preprocessor as if it had never been a macro name.

[4569] For example,

[4570] #define FOO 4

[4571] x = FOO;

[4572] #undef FOO

[4573] x = FOO;

[4574] expands into

[4575] x = 4;

[4576] x = FOO;

[4577] In this example, ‘FOO’ had better be a variable or function as well as (temporarily) a macro, in order for the result of the expansion to be valid C code.

[4578] The same form of ‘#undef’ directive may cancel definitions with arguments or definitions that don't expect arguments. The ‘#undef’ directive has no effect when used on a name not currently defined as a macro.
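The effect of ‘#undef’ at a specific point in the file can be sketched as follows, assuming ‘FOO’ is also a variable (here given the arbitrary value 10). The function names are hypothetical.

```c
/* A variable that shares its name with the temporary macro. */
static int FOO = 10;

#define FOO 4
int with_macro(void) { return FOO; }      /* FOO expands to 4 */
#undef FOO
int without_macro(void) { return FOO; }   /* FOO is the variable again */

#undef FOO   /* no effect: FOO is not currently defined as a macro */
```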

[4579] Redefining Macros

[4580] Redefining a macro means defining (with ‘#define’) a name that is already defined as a macro. A redefinition is trivial if the new definition is transparently identical to the old one. One probably wouldn't deliberately write a trivial redefinition, but they can happen automatically when a header file is included more than once, so they are accepted silently and without effect. Nontrivial redefinition is considered likely to be an error, so it provokes a warning message from the preprocessor. However, sometimes it is useful to change the definition of a macro in mid-compilation. One can inhibit the warning by undefining the macro with ‘#undef’ before the second definition. In order for a redefinition to be trivial, the new definition must exactly match the one already in effect, with two possible exceptions:

[4581] Whitespace may be added or deleted at the beginning or the end. Whitespace may be changed in the middle (but not inside strings). However, it may not be eliminated entirely, and it may not be added where there was no whitespace at all. Recall that a comment counts as whitespace.

[4582] Pitfalls and Subtleties of Macros

[4583] This section describes some special rules that apply to macros and macro expansion, and points out certain cases in which the rules have counterintuitive consequences that one must watch out for.

[4584] Improperly Nested Constructs

[4585] Recall that when a macro is called with arguments, the arguments are substituted into the macro body and the result is checked, together with the rest of the input file, for more macro calls. It is possible to piece together a macro call coming partially from the macro body and partially from the actual arguments. For example,

[4586] #define double(x) (2*(x))

[4587] #define call_with_1(x) x(1)

[4588] would expand ‘call_with_1 (double)’ into ‘(2*(1))’. Macro definitions do not have to have balanced parentheses. By writing an unbalanced open parenthesis in a macro body, it is possible to create a macro call that begins inside the macro body but ends outside of it. For example,

[4589] #define strange(file) fprintf (file, "%s %d",

[4590] . . .

[4591] strange(stderr) p, 35)

[4592] This bizarre example expands to ‘fprintf (stderr, “%s %d”, p, 35)’!

[4593] Unintended Grouping of Arithmetic

[4594] One may have noticed that in most of the macro definition examples shown above, each occurrence of a macro argument name had parentheses around it. In addition, another pair of parentheses usually surrounds the entire macro definition. Here is why it is best to write macros that way.

[4595] Suppose one defines a macro as follows,

[4596] #define ceil_div(x, y) (x + y - 1) / y

[4597] whose purpose is to divide, rounding up. (One use for this operation is to compute how many ‘int’ objects are needed to hold a certain number of ‘char’ objects.) Then suppose it is used as follows:

[4598] a = ceil_div (b & c, sizeof (int));

[4599] This expands into

[4600] a = (b & c + sizeof (int) - 1) / sizeof

[4601] (int);

[4602] which does not do what is intended. The operator-precedence rules of C make it equivalent to this:

[4603] a = (b & (c + sizeof (int) - 1)) /

[4604] sizeof (int);

[4605] But what is desired is this:

[4606] a = ((b & c) + sizeof (int) - 1) / sizeof (int);

[4607] Defining the macro as

[4608] #define ceil_div(x, y) ((x) + (y) - 1) / (y)

[4609] provides the desired result. However, unintended grouping can result in another way. Consider ‘sizeof ceil_div(1, 2)’. That has the appearance of a C expression that would compute the size of the type of ‘ceil_div (1, 2)’, but in fact it means something very different. Here is what it expands to:

[4610] sizeof ((1) + (2) - 1) / (2)

[4611] This would take the size of an integer and divide it by two. The precedence rules have put the division outside the ‘sizeof’ when it was intended to be inside. Parentheses around the entire macro definition can prevent such problems. Here, then, is the recommended way to define

[4612] ‘ceil_div’:

[4613] #define ceil_div(x, y) (((x) + (y) - 1) / \

[4614] (y))
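The three stages of parenthesization can be compared side by side in a sketch. The names ‘CEIL_DIV_BARE’ and ‘CEIL_DIV_ARGS’ are hypothetical labels for the two flawed definitions, and the arguments ‘10’ and ‘2 + 2’ are chosen here (instead of the ‘b & c’ example in the text) so that the wrong grouping is visible with plain arithmetic.

```c
#include <stddef.h>

#define CEIL_DIV_BARE(x, y) (x + y - 1) / y         /* no parentheses */
#define CEIL_DIV_ARGS(x, y) ((x) + (y) - 1) / (y)   /* arguments only */
#define ceil_div(x, y) (((x) + (y) - 1) / (y))      /* recommended */

/* Expands to (10 + 2 + 2 - 1) / 2 + 2, so the divisor is split apart. */
int bare(void) { return CEIL_DIV_BARE(10, 2 + 2); }

/* Expands to (((10) + (2 + 2) - 1) / (2 + 2)), dividing by four. */
int full(void) { return ceil_div(10, 2 + 2); }

/* The division lands outside the sizeof: sizeof ((1) + (2) - 1) / (2). */
size_t size_args(void) { return sizeof CEIL_DIV_ARGS(1, 2); }

/* The outer parentheses keep the whole expression inside the sizeof. */
size_t size_full(void) { return sizeof ceil_div(1, 2); }
```

Only the fully parenthesized form divides 10 by 4 rounding up, and only that form yields the size of an ‘int’ under ‘sizeof’.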

[4615] Swallowing the Semicolon

[4616] Often it is desirable to define a macro that expands into a compound statement. Consider, for example, the following macro, that advances a pointer (the argument ‘p’ says where to find it) across whitespace characters:

[4617] #define SKIP_SPACES(p, limit) \

[4618] { register char *lim = (limit); \

[4619] while (p != lim) { \

[4620] if (*p++ != ' ') { \

[4621] p--; break; }}}

[4622] Here Backslash-Newline is used to split the macro definition, which must be a single line, so that it resembles the way such C code would be laid out if not part of a macro definition. A call to this macro might be ‘SKIP_SPACES (p, lim)’. Strictly speaking, the call expands to a compound statement, which is a complete statement with no need for a semicolon to end it. But it looks like a function call. So it minimizes confusion if one can use it like a function call, writing a semicolon afterward, as in

[4623] ‘SKIP_SPACES (p, lim);’

[4624] But this can cause trouble before ‘else’ statements, because the semicolon is actually a null statement. Suppose one writes:

[4625] if (*p != 0)

[4626] SKIP_SPACES (p, lim);

[4627] else

[4628] . . .

[4629] The presence of two statements (the compound statement and a null statement) in between the ‘if’ condition and the ‘else’ makes invalid C code.

[4630] The definition of the macro ‘SKIP_SPACES’ can be altered to solve this problem, using a ‘do . . . while’ statement. Here is how:

[4631] #define SKIP_SPACES(p, limit) \

[4632] do { register char *lim = (limit); \

[4633] while (p != lim) { \

[4634] if (*p++ != ' ') { \

[4635] p--; break; }}} \

[4636] while (0)

[4637] Now ‘SKIP_SPACES (p, lim);’ expands into

[4638] do { . . . } while (0);

[4639] which is one statement.
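A sketch of the ‘do . . . while (0)’ form used before an ‘else’ follows. The wrapper function ‘skip_leading_spaces’ is hypothetical, ‘const char *’ replaces ‘register char *’ so that string constants can be passed, and the limit argument is deliberately not named ‘lim’, since that would collide with the macro's own local variable.

```c
#define SKIP_SPACES(p, limit)              \
    do { const char *lim = (limit);        \
         while (p != lim) {                \
             if (*p++ != ' ') {            \
                 p--; break; }}}           \
    while (0)

/* Hypothetical wrapper: the if/else compiles only because the macro call
   plus its semicolon forms exactly one statement. */
const char *skip_leading_spaces(const char *p, const char *end)
{
    if (*p != 0)
        SKIP_SPACES(p, end);
    else
        return p;
    return p;
}
```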

[4640] Duplication of Side Effects

[4641] Many C programs define a macro ‘min’, for “minimum”, like this:

[4642] #define min(X, Y) ((X) < (Y) ? (X) : (Y))

[4643] When one uses this macro with an argument containing a side effect, as shown here,

[4644] next = min (x + y, foo (z));

[4645] it expands as follows:

[4646] next = ((x + y) < (foo (z)) ? (x + y) : (foo (z)));

[4647] where ‘x + y’ has been substituted for ‘X’ and ‘foo (z)’ for ‘Y’. The function ‘foo’ is used only once in the statement as it appears in the program, but the expression ‘foo (z)’ has been substituted twice into the macro expansion. As a result, ‘foo’ might be called two times when the statement is executed. If it has side effects or if it takes a long time to compute, the results might not be what one intended. ‘min’ is therefore called an unsafe macro. The best solution to this problem is to define ‘min’ in a way that computes the value of ‘foo (z)’ only once. The C language offers no standard way to do this, but it can be done with GNU C extensions as follows:

[4648] #define min(X, Y) \

[4649] ({ typeof (X) __x = (X), __y = (Y); \

[4650] (__x < __y) ? __x : __y; })

[4651] If one does not wish to use GNU C extensions, the only solution is to be careful when using the macro ‘min’. For example, one can calculate the value of ‘foo (z)’, save it in a variable, and use that variable in ‘min’:

[4652] #define min(X, Y) ((X) < (Y) ? (X) : (Y))

[4653] . . .

[4654] {

[4655] int tem = foo (z);

[4656] next = min (x + y, tem);

[4657] }

[4658] (where it is assumed that ‘foo’ returns type ‘int’).
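The duplicated evaluation can be made visible by counting calls, as in this sketch. The counter ‘calls’, the body of ‘foo’, and the two wrapper functions are assumptions introduced only to observe the behavior.

```c
#define min(X, Y) ((X) < (Y) ? (X) : (Y))

static int calls;   /* number of times foo has run */

/* Hypothetical function with a side effect: it counts its invocations. */
static int foo(int z)
{
    calls++;
    return z;
}

/* Passes foo (z) straight to the macro; foo may run twice. */
int next_with_macro(int x, int z, int *ncalls)
{
    int next;

    calls = 0;
    next = min(x, foo(z));
    *ncalls = calls;
    return next;
}

/* Saves foo (z) in a temporary first; foo runs exactly once. */
int next_with_temp(int x, int z, int *ncalls)
{
    int next, tem;

    calls = 0;
    tem = foo(z);
    next = min(x, tem);
    *ncalls = calls;
    return next;
}
```

When ‘foo (z)’ turns out to be the smaller value, the macro evaluates it once for the comparison and once more for the result, whereas the temporary variable keeps it to a single call.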

[4659] Self-Referential Macros

[4660] A self-referential macro is one whose name appears in its definition. A special feature of ANSI Standard C is that the self-reference is not considered a macro call. It is passed into the preprocessor output unchanged.

[4661] Let's consider an example:

[4662] #define foo (4 + foo)

[4663] where ‘foo’ is also a variable in the program.

[4664] Following the ordinary rules, each reference to ‘foo’ would expand into ‘(4 + foo)’; then this would be rescanned and would expand into ‘(4 + (4 + foo))’; and so on until it causes a fatal error (memory full) in the preprocessor.

[4665] However, the special rule about self-reference cuts this process short after one step, at ‘(4 + foo)’. Therefore, this macro definition has the possibly useful effect of causing the program to add 4 to the value of ‘foo’ wherever ‘foo’ is referred to. In most cases, it is a bad idea to take advantage of this feature. A person reading the program who sees that ‘foo’ is a variable will not expect that it is a macro as well. The reader will come across the identifier ‘foo’ in the program and think its value should be that of the variable ‘foo’, whereas in fact the value is four greater. The special rule for self-reference applies also to indirect self-reference. This is the case where a macro ‘x’ expands to use a macro ‘y’, and the expansion of ‘y’ refers to the macro ‘x’. The resulting reference to ‘x’ comes indirectly from the expansion of ‘x’, so it is a self-reference and is not further expanded. Thus, after

[4666] #define x (4 + y)

[4667] #define y (2 * x)

[4668] ‘x’ would expand into ‘(4 + (2 * x))’. Clear? But suppose ‘y’ is used elsewhere, not from the definition of ‘x’. Then the use of ‘x’ in the expansion of ‘y’ is not a self-reference because ‘x’ is not “in progress”. So it does expand. However, the expansion of ‘x’ contains a reference to ‘y’, and that is an indirect self-reference now because ‘y’ is “in progress”. The result is that ‘y’ expands to ‘(2 * (4 + y))’. It is not clear that this behavior would ever be useful, but it is specified by the ANSI C standard, so one may need to understand it.
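Both the direct and the indirect case can be observed in a short sketch, assuming that ‘foo’, ‘x’ and ‘y’ are also variables, each initialized here to the arbitrary value 1. The reader functions are hypothetical, and the trailing ‘#undef’ lines only keep the macros local to the example.

```c
/* Variables that share their names with the macros defined below. */
static int foo = 1;
static int x = 1, y = 1;

#define foo (4 + foo)
#define x (4 + y)
#define y (2 * x)

int read_foo(void) { return foo; }   /* expands to (4 + foo) */
int read_x(void) { return x; }       /* expands to (4 + (2 * x)) */
int read_y(void) { return y; }       /* expands to (2 * (4 + y)) */

#undef foo
#undef x
#undef y
```

In each expansion the innermost name is left alone and therefore refers to the variable.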

[4669] Separate Expansion of Macro Arguments

[4670] It has been explained that the expansion of a macro, including the substituted actual arguments, is scanned over again for macro calls to be expanded.

[4671] What really happens is more subtle: first each actual argument text is scanned separately for macro calls. Then the results of this are substituted into the macro body to produce the macro expansion, and the macro expansion is scanned again for macros to expand. The result is that the actual arguments are scanned twice to expand macro calls in them. Most of the time, this has no effect. If the actual argument contained any macro calls, they are expanded during the first scan. The result therefore contains no macro calls, so the second scan does not change it. If the actual argument were substituted as given, with no prescan, the single remaining scan would find the same macro calls and produce the same results. One might expect the double scan to change the results when a self-referential macro is used in an actual argument of another macro: the self-referential macro would be expanded once in the first scan, and a second time in the second scan. But this is not what happens. The self-references that do not expand in the first scan are marked so that they will not expand in the second scan either.

[4672] The prescan is not done when an argument is stringified or concatenated. Thus,

[4673] #define str(s) #s

[4674] #define foo 4

[4675] str (foo)

[4676] expands to ‘“foo”’. Once more, prescan has been prevented from having any noticeable effect. More precisely, stringification and concatenation use the argument as written, in unprescanned form. The same actual argument would be used in prescanned form if it is substituted elsewhere without stringification or concatenation.

[4677] #define str(s) #s lose(s)

[4678] #define foo 4

[4679] str (foo)

[4680] expands to ‘“foo” lose(4)’.

[4681] One might now ask, “Why mention the prescan, if it makes no difference? And why not skip it and make the preprocessor faster?” The answer is that the prescan does make a difference in three special cases:

[4682] Nested calls to a macro.

[4683] Macros that call other macros that stringify or concatenate.

[4684] Macros whose expansions contain unshielded commas.

[4685] Nested calls to a macro occur when a macro's actual argument contains a call to that very macro. For example, if ‘f’ is a macro that expects one argument, ‘f (f (1))’ is a nested pair of calls to ‘f’. The desired expansion is made by expanding ‘f (1)’ and substituting that into the definition of ‘f’. The prescan causes the expected result to happen. Without the prescan, ‘f (1)’ itself would be substituted as an actual argument, and the inner use of ‘f’ would appear during the main scan as an indirect self-reference and would not be expanded. Here, the prescan cancels an undesirable side effect (in the medical, not computational, sense of the term) of the special rule for self-referential macros. But prescan causes trouble in certain other cases of nested macro calls. Here is an example:

[4686] #define foo a,b

[4687] #define bar(x) lose(x)

[4688] #define lose(x) (1 + (x))

[4689] bar(foo)

[4690] It is desired that ‘bar(foo)’ turn into ‘(1 + (foo))’, which would then turn into ‘(1 + (a,b))’. But instead, ‘bar(foo)’ expands into ‘lose(a,b)’, and one gets an error because ‘lose’ requires a single argument. In this case, the problem is easily solved by the same parentheses that ought to be used to prevent misnesting of arithmetic operations:

[4691] #define foo (a,b)

[4692] #define bar(x) lose((x))

[4693] The problem is more serious when the operands of the macro are not expressions; for example, when they are statements. Then parentheses are unacceptable because they would make for invalid C code:

[4694] #define foo { int a, b; . . . }

[4695] In GNU C one can shield the commas using the ‘({ . . . })’ construct which turns a compound statement into an expression:

[4696] #define foo ({ int a, b; . . . })

[4697] Or one can rewrite the macro definition to avoid such commas:

[4698] #define foo { int a; int b; . . . }

[4699] There is also one case where prescan is useful. It is possible to use prescan to expand an argument and then stringify it, if one uses two levels of macros. Let's add a new macro ‘xstr’ to the example shown above:

[4700] #define xstr(s) str(s)

[4701] #define str(s) #s

[4702] #define foo 4

[4703] xstr (foo)

[4704] This expands into ‘“4”’, not ‘“foo”’. The reason for the difference is that the argument of ‘xstr’ is expanded at prescan (because ‘xstr’ does not specify stringification or concatenation of the argument). The result of prescan then forms the actual argument for ‘str’. ‘str’ uses its argument without prescan because it performs stringification; but it cannot prevent or undo the prescanning already done by ‘xstr’.
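The difference between the one-level and two-level forms can be checked directly, as in this sketch. The accessor functions ‘direct’ and ‘indirect’ are hypothetical, and the final ‘#undef’ only keeps ‘foo’ from leaking beyond the example.

```c
#define xstr(s) str(s)
#define str(s) #s
#define foo 4

/* str stringifies its argument as written, without prescan. */
const char *direct(void) { return str(foo); }

/* xstr prescans its argument, so str receives 4 instead of foo. */
const char *indirect(void) { return xstr(foo); }

#undef foo
```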

[4705] Cascaded Use of Macros

[4706] A cascade of macros is when one macro's body contains a reference to another macro. This is very common practice. For example,

[4707] #define BUFSIZE 1020

[4708] #define TABLESIZE BUFSIZE

[4709] This is not at all the same as defining ‘TABLESIZE’ to be ‘1020’. The ‘#define’ for ‘TABLESIZE’ uses exactly the body one specifies (in this case, ‘BUFSIZE’) and does not check to see whether it too is the name of a macro.

[4710] It's only when one uses ‘TABLESIZE’ that the result of its expansion is checked for more macro names. This makes a difference if one changes the definition of ‘BUFSIZE’ at some point in the source file. ‘TABLESIZE’, defined as shown, will always expand using the definition of ‘BUFSIZE’ that is currently in effect:

[4711] #define BUFSIZE 1020

[4712] #define TABLESIZE BUFSIZE

[4713] #undef BUFSIZE

[4714] #define BUFSIZE 37

[4715] Now ‘TABLESIZE’ expands (in two stages) to ‘37’. (The ‘#undef’ is to prevent any warning about the nontrivial redefinition of ‘BUFSIZE’.)
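As a minimal sketch of the cascade just described, the fragment below uses ‘TABLESIZE’ once before and once after ‘BUFSIZE’ is redefined. The function names table_size_before and table_size_after are assumptions added for illustration.

```c
#define BUFSIZE 1020
#define TABLESIZE BUFSIZE

/* At this point TABLESIZE expands to BUFSIZE, which expands to 1020. */
int table_size_before(void)
{
    return TABLESIZE;
}

#undef BUFSIZE
#define BUFSIZE 37

/* TABLESIZE itself is unchanged, but the definition of BUFSIZE now in
   effect is 37, so the two-stage expansion yields 37. */
int table_size_after(void)
{
    return TABLESIZE;
}
```

The same name therefore produces different values depending on where in the source file it is used.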

[4716] Newlines in Macro Arguments

[4717] Traditional macro processing carries forward all newlines in macro arguments into the expansion of the macro. This means that, if some of the arguments are substituted more than once, or not at all, or out of order, newlines can be duplicated, lost, or moved around within the expansion. If the expansion consists of multiple statements, then the effect is to distort the line numbers of some of these statements. The result can be incorrect line numbers, in error messages or displayed in a debugger. The GNU C preprocessor operating in ANSI C mode adjusts appropriately for multiple use of an argument: the first use expands all the newlines, and subsequent uses of the same argument produce no newlines. But even in this mode, it can produce incorrect line numbering if arguments are used out of order, or not used at all.

[4718] Here is an example illustrating this problem:

[4719] #define ignore_second_arg(a,b,c) a; c

[4720] ignore_second_arg (foo ( ),

[4721] ignored ( ),

[4722] syntax error);

[4723] The syntax error triggered by the tokens ‘syntax error’ results in an error message citing line four, even though the statement text comes from line five.

[4724] Conditionals

[4725] Introduction

[4726] In a macro processor, a conditional is a directive that allows a part of the program to be ignored during compilation, on some conditions. In the C preprocessor, a conditional can test either an arithmetic expression or whether a name is defined as a macro. A conditional in the C preprocessor resembles in some ways an ‘if’ statement in C, but it is important to understand the difference between them. The condition in an ‘if’ statement is tested during the execution of the program. Its purpose is to allow the program to behave differently from run to run, depending on the data it is operating on. The condition in a preprocessing conditional directive is tested when the program is compiled. Its purpose is to allow different code to be included in the program depending on the situation at the time of compilation.
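The difference between the two kinds of test can be sketched as follows. The macro ‘VERBOSE’ and the function names are hypothetical, chosen only for this illustration.

```c
#define VERBOSE 0   /* hypothetical compile-time switch */

/* Run-time test: both branches are compiled, and the argument selects
   one of them each time the function is called. */
int runtime_level(int verbose)
{
    if (verbose)
        return 2;
    return 1;
}

/* Compile-time test: the preprocessor keeps only one return statement,
   so the compiler never sees the other branch. */
int compiled_level(void)
{
#if VERBOSE
    return 2;
#else
    return 1;
#endif
}
```

Changing the behavior of compiled_level requires recompiling with a different definition of ‘VERBOSE’, whereas runtime_level can vary from call to call.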

[4727] Why Conditionals are Used

[4728] Generally there are three kinds of reason to use a conditional.

[4729] A program may need to use different code depending on the machine or operating system it is to run on. In some cases the code for one operating system may be erroneous on another operating system; for example, it might refer to library routines that do not exist on the other system. When this happens, it is not enough to avoid executing the invalid code: merely having it in the program makes it impossible to link the program and run it. With a preprocessing conditional, the offending code can be effectively excised from the program when it is not valid.

[4730] One may want to be able to compile the same source file into two different programs. Sometimes the difference between the programs is that one makes frequent time-consuming consistency checks on its intermediate data, or prints the values of those data for debugging, while the other does not.

[4731] A conditional whose condition is always false is a good way to exclude code from the program but keep it as a sort of comment for future reference.

[4732] Most simple programs that are intended to run on only one machine may not need to use preprocessing conditionals.

[4733] Syntax of Conditionals

[4734] A conditional in the C preprocessor begins with a conditional directive: ‘#if’, ‘#ifdef’ or ‘#ifndef’. ‘#ifdef’ and ‘#ifndef’ are described later; only ‘#if’ is explained here.

[4735] The ‘#if’ Directive

[4736] The ‘#if’ directive in its simplest form consists of

[4737] #if expression

[4738] controlled text

[4739] #endif /* expression */

[4740] The comment following the ‘#endif’ is not required, but it is a good practice because it helps people match the ‘#endif’ to the corresponding ‘#if’. Such comments should always be used, except in short conditionals that are not nested. In fact, one can put anything at all after the ‘#endif’ and it is ignored by the GNU C preprocessor, but only comments are acceptable in ANSI Standard C. The expression is a C expression of integer type, subject to stringent restrictions. It may contain:

[4741] Integer constants, which are all regarded as long or unsigned long.

[4742] Character constants, which are interpreted according to the character set and conventions of the machine and operating system on which the preprocessor is running. The GNU C preprocessor uses the C data type ‘char’ for these character constants; therefore, whether some character codes are negative is determined by the C compiler used to compile the preprocessor. If it treats ‘char’ as signed, then character codes large enough to set the sign bit are considered negative; otherwise, no character code is considered negative.

[4743] Arithmetic operators for addition, subtraction, multiplication, division, bitwise operations, shifts, comparisons, and logical operations (‘&&’ and ‘||’).

[4744] Identifiers that are not macros, which are all treated as zero(!). Macro calls: all macro calls in the expression are expanded before actual computation of the expression's value begins. Note that ‘sizeof’ operators and enum-type values are not allowed. Enum-type values, like all other identifiers that are not taken as macro calls and expanded, are treated as zero.
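The treatment of non-macro identifiers in ‘#if’ can be sketched as follows. The names NOT_A_MACRO, color, RED and the helper functions are assumptions made up for this illustration; some compilers may warn about the undefined identifiers when asked to.

```c
/* NOT_A_MACRO is deliberately left undefined, so #if evaluates it as 0. */
#if NOT_A_MACRO == 0
#define UNDEFINED_IS_ZERO 1
#else
#define UNDEFINED_IS_ZERO 0
#endif

enum color { RED = 5 };

/* RED is an enum constant, not a macro. The preprocessor knows nothing
   about enums, so inside #if it is also evaluated as 0. */
#if RED == 5
#define ENUM_SEEN_BY_IF 1
#else
#define ENUM_SEEN_BY_IF 0
#endif

int undefined_is_zero(void)
{
    return UNDEFINED_IS_ZERO;
}

int enum_seen_by_if(void)
{
    return ENUM_SEEN_BY_IF;
}
```

Here undefined_is_zero() returns 1 and enum_seen_by_if() returns 0, even though RED equals 5 in ordinary C code.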

[4745] The controlled text inside of a conditional can include preprocessing directives. Then the directives inside the conditional are obeyed only if that branch of the conditional succeeds. The text can also contain other conditional groups. However, the ‘#if’ and ‘#endif’ directives must balance.

[4746] The ‘#else’ Directive

[4747] The ‘#else’ directive can be added to a conditional to provide alternative text to be used if the condition is false. This is what it looks like:

[4748] #if expression

[4749] text-if-true

[4750] #else /* Not expression */

[4751] text-if-false

[4752] #endif /* Not expression */

[4753] If expression is nonzero, and thus the text-if-true is active, then ‘#else’ acts like a failing conditional and the text-if-false is ignored. Contrariwise, if the ‘#if’ conditional fails, the text-if-false is considered included.

[4754] The ‘#elif’ Directive

[4755] One common case of nested conditionals is used to check for more than two possible alternatives. For example, one might have

[4756] #if X == 1

[4757] . . .

[4758] #else /* X != 1 */

[4759] # if X == 2

[4760] . . .

[4761] #else /* X != 2 */

[4762] . . .

[4763] #endif /* X != 2 */

[4764] #endif /* X != 1 */

[4765] Another conditional directive, ‘#elif’, allows this to be abbreviated as follows:

[4766] #if X == 1

[4767] . . .

[4768] #elif X == 2

[4769] . . .

[4770] #else /* X != 2 and X != 1 */

[4771] . . .

[4772] #endif /* X != 2 and X != 1 */

[4773] ‘#elif’ stands for “else if”. Like ‘#else’, it goes in the middle of a ‘#if’-‘#endif’ pair and subdivides it; it does not require a matching ‘#endif’ of its own. Like ‘#if’, the ‘#elif’ directive includes an expression to be tested.

[4774] The text following the ‘#elif’ is processed only if the original ‘#if’-condition failed and the ‘#elif’ condition succeeds. More than one ‘#elif’ can go in the same ‘#if’-‘#endif’ group. Then the text after each ‘#elif’ is processed only if the ‘#elif’ condition succeeds after the original ‘#if’ and any previous ‘#elif’ directives within it have failed. ‘#else’ is equivalent to ‘#elif 1’, and ‘#else’ is allowed after any number of ‘#elif’ directives, but ‘#elif’ may not follow ‘#else’.
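A minimal sketch of an ‘#if’/‘#elif’/‘#else’ chain follows. The default value given to ‘X’, the macro ‘BRANCH’ and the function selected_branch are assumptions made for the illustration.

```c
#ifndef X
#define X 2   /* assumed default so that the sketch compiles on its own */
#endif

#if X == 1
#define BRANCH "one"
#elif X == 2
#define BRANCH "two"
#else /* X != 2 and X != 1 */
#define BRANCH "other"
#endif /* X != 2 and X != 1 */

/* Reports which of the three groups was kept by the preprocessor. */
const char *selected_branch(void)
{
    return BRANCH;
}
```

With ‘X’ defined as 2, the first condition fails, the ‘#elif’ condition succeeds, and the ‘#else’ group is skipped.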

[4775] Keeping Deleted Code for Future Reference

[4776] If one replaces or deletes a part of the program but wants to keep the old code around as a comment for future reference, the easy way to do this is to put ‘#if 0’ before it and ‘#endif’ after it. This is better than using comment delimiters ‘/*’ and ‘*/’ since those won't work if the code already contains comments (C comments do not nest). This works even if the code being turned off contains conditionals, but they must be entire conditionals (balanced ‘#if’ and ‘#endif’).

[4777] Conversely, do not use ‘#if 0’ for comments which are not C code. Use the comment delimiters ‘/*’ and ‘*/’ instead. The interior of ‘#if 0’ must consist of complete tokens; in particular, single quote characters must balance. But comments often contain unbalanced single quote characters (known in English as apostrophes). These confuse ‘#if 0’. They do not confuse ‘/*’.
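As a small sketch of the ‘#if 0’ technique, the function below keeps an old implementation for reference while compiling only the new one. The function name scale and both implementations are hypothetical.

```c
int scale(int x)
{
#if 0   /* old implementation, kept for reference and never compiled */
    /* A comment inside the disabled region causes no trouble here. */
    return x * 2;
#endif
    return x * 3;   /* current implementation */
}
```

Wrapping the old code in ‘/*’ and ‘*/’ instead would fail, because the comment already inside it would end the outer comment early.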

[4778] Conditionals and Macros

[4779] Conditionals are useful in connection with macros or assertions, because those are the only ways that an expression's value can vary from one compilation to another. A ‘#if’ directive whose expression uses no macros or assertions is equivalent to ‘#if 1’ or ‘#if 0’; one might as well determine which one, by computing the value of the expression, and then simplify the program. For example, here is a conditional that tests the expression

[4780] ‘BUFSIZE == 1020’, where ‘BUFSIZE’ must be a macro.

[4781] #if BUFSIZE == 1020

[4782] printf ("Large buffers!\n");

[4783] #endif /* BUFSIZE is large */

[4784] (Programmers often wish they could test the size of a variable or data type in ‘#if’, but this does not work. The preprocessor does not understand sizeof, or typedef names, or even the type keywords such as int.)

[4785] The special operator ‘defined’ is used in ‘#if’ expressions to test whether a certain name is defined as a macro. Either ‘defined name’ or ‘defined (name)’ is an expression whose value is 1 if name is defined as a macro at the current point in the program, and 0 otherwise. For the ‘defined’ operator it makes no difference what the definition of the macro is; all that matters is whether there is a definition. Thus, for example,

[4786] #if defined (vax) || defined (ns16000)

[4787] would succeed if either of the names ‘vax’ and ‘ns16000’ is defined as a macro. One can test the same condition using assertions, like this:

[4788] #if #cpu (vax) || #cpu (ns16000)

[4789] If a macro is defined and later undefined with ‘#undef’, subsequent use of the ‘defined’ operator returns 0, because the name is no longer defined. If the macro is defined again with another ‘#define’, ‘defined’ resumes returning 1.

[4792] Conditionals that test whether just one name is defined are very common, so there are two special short conditional directives for this case.

[4793] #ifdef name is equivalent to ‘#if defined (name)’.

[4794] #ifndef name is equivalent to ‘#if ! defined (name)’.
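The ‘defined’ operator and its two short forms can be sketched as follows. The macro names FEATURE_A and FEATURE_B and the helper functions are hypothetical names used only for this illustration.

```c
#define FEATURE_A            /* defined, with an empty body */
/* FEATURE_B is deliberately left undefined. */

#if defined (FEATURE_A) || defined (FEATURE_B)
#define EITHER 1
#else
#define EITHER 0
#endif

#ifdef FEATURE_B             /* same as: #if defined (FEATURE_B) */
#define HAS_B 1
#else
#define HAS_B 0
#endif

#ifndef FEATURE_B            /* same as: #if ! defined (FEATURE_B) */
#define LACKS_B 1
#else
#define LACKS_B 0
#endif

#undef FEATURE_A             /* from here on, defined FEATURE_A is 0 */
#if defined FEATURE_A
#define A_AFTER_UNDEF 1
#else
#define A_AFTER_UNDEF 0
#endif

int either(void)        { return EITHER; }
int has_b(void)         { return HAS_B; }
int lacks_b(void)       { return LACKS_B; }
int a_after_undef(void) { return A_AFTER_UNDEF; }
```

Note that FEATURE_A counts as defined even though its body is empty; only the existence of a definition matters.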

[4795] Macro definitions can vary between compilations for several reasons.

[4796] Some macros are predefined on each kind of machine. For example, on a Vax, the name ‘vax’ is a predefined macro. On other machines, it would not be defined.

[4797] Many more macros are defined by system header files. Different systems and machines define different macros, or give them different values. It is useful to test these macros with conditionals to avoid using a system feature on a machine where it is not implemented.

[4798] Macros are a common way of allowing users to customize a program for different machines or applications. For example, the macro ‘BUFSIZE’ might be defined in a configuration file for the program that is included as a header file in each source file. One would use ‘BUFSIZE’ in a preprocessing conditional in order to generate different code depending on the chosen configuration.

[4799] Macros can be defined or undefined with ‘-D’ and ‘-U’ command options when one compiles the program. One can arrange to compile the same source file into two different programs by choosing a macro name to specify which program one wants, writing conditionals to test whether or how this macro is defined and then controlling the state of the macro with compiler command options.
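A minimal sketch of a source file meant to be built in two variants follows. The macro name CONSISTENCY_CHECKS, the counter checks_run and the function sum are assumptions for illustration; the variant with checks would be selected by passing an option such as ‘-DCONSISTENCY_CHECKS=1’ to the compiler.

```c
/* Default to the variant without checks unless the macro was supplied
   on the command line. */
#ifndef CONSISTENCY_CHECKS
#define CONSISTENCY_CHECKS 0
#endif

int checks_run = 0;   /* counts how many checks were executed */

int sum(const int *v, int n)
{
    int total = 0;
    int i;

    for (i = 0; i < n; i++) {
#if CONSISTENCY_CHECKS
        checks_run++;   /* time-consuming consistency checks would go here */
#endif
        total += v[i];
    }
    return total;
}
```

Compiled without the option, the checking statement is absent from the program entirely, so checks_run stays at zero.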

[4800] Assertions

[4801] Assertions are a more systematic alternative to macros in writing conditionals to test what sort of computer or system the compiled program may run on. Assertions are usually predefined, but one can define them with preprocessing directives or command-line options. The macros traditionally used to describe the type of target are not classified in any way according to which question they answer; they may indicate a hardware architecture, a particular hardware model, an operating system, a particular version of an operating system, or specific configuration options. These are jumbled together in a single namespace. In contrast, each assertion consists of a named question and an answer. The question is usually called the predicate. An assertion looks like this:

[4802] # predicate (answer)

[4803] The predicate must be a properly formed identifier. The value of answer can be any sequence of words; all characters are significant except for leading and trailing whitespace, and differences in internal whitespace sequences are ignored. Thus, ‘x + y’ is different from ‘x+y’ but equivalent to the same three words separated by longer runs of whitespace. ‘)’ is not allowed in an answer.

[4804] Here is a conditional to test whether the answer is asserted for the predicate:

[4805] ‘#if #predicate (answer)’. There may be more than one answer asserted for a given predicate. If one omits the answer, one can test whether any answer is asserted for predicate: ‘#if #predicate’.

[4806] Most of the time, the assertions one tests are predefined assertions. GNU C provides three predefined predicates: system, cpu, and machine. system is for assertions about the type of software, cpu describes the type of computer architecture, and machine gives more information about the computer. For example, on a GNU system, the following assertions would be true:

[4807] #system (gnu)

[4808] #system (mach)

[4809] #system (mach 3)

[4810] #system (mach 3.subversion)

[4811] #system (hurd)

[4812] #system (hurd version)

[4813] and perhaps others. The alternatives with more or less version information let one ask more or less detailed questions about the type of system software. On a Unix system, one would find #system (unix) and perhaps one of: #system (aix), #system (bsd), #system (hpux), #system (lynx), #system (mach), #system (posix), #system (svr3), #system (svr4), or #system (xpg4) with possible version numbers following.

[4814] Other values for system are #system (mvs) and #system (vms). Portability note: Many Unix C compilers provide only one answer for the system assertion: #system (unix), if they support assertions at all. This is less than useful. An assertion with a multi-word answer is completely different from several assertions with individual single-word answers. For example, the presence of system (mach 3.0) does not mean that system (3.0) is true. It also does not directly imply system (mach), but in GNU C, that last is normally asserted as well.

[4815] The current list of possible assertion values for cpu is:

[4816] #cpu (a29k), #cpu (alpha), #cpu (arm), #cpu (clipper),

[4817] #cpu (convex), #cpu (elxsi), #cpu (tron), #cpu (h8300),

[4818] #cpu (i370), #cpu (i386), #cpu (i860),

[4819] #cpu (i960), #cpu (m68k), #cpu (m88k),

[4820] #cpu (mips), #cpu (ns32k), #cpu (hppa),

[4821] #cpu (pyr), #cpu (ibm032), #cpu (rs6000),

[4822] #cpu (sh), #cpu (sparc), #cpu (spur),

[4823] #cpu (tahoe), #cpu (vax), #cpu (we32000).

[4824] One can create assertions within a C program using ‘#assert’, like this:

[4825] #assert predicate (answer)

[4826] (Note the absence of a ‘#’ before predicate.)

[4827] Each time one does this, one asserts a new true answer for predicate. Asserting one answer does not invalidate previously asserted answers; they all remain true. The only way to remove an assertion is with ‘#unassert’. ‘#unassert’ has the same syntax as ‘#assert’. One can also remove all assertions about predicate like this: ‘#unassert predicate’. One can also add or cancel assertions using command options when running gcc or cpp.

[4828] The ‘#error’ and ‘#warning’ Directives

[4829] The directive ‘#error’ causes the preprocessor to report a fatal error. The rest of the line that follows ‘#error’ is used as the error message.

[4830] One would use ‘#error’ inside of a conditional that detects a combination of parameters which one knows the program does not properly support. For example, if one knows that the program may not run properly on a Vax, one might write the three lines ‘#ifdef __vax__’, ‘#error Won't work on Vaxen. See comments at get_last_object.’ and ‘#endif’.

[4831] If one has several configuration parameters that must be set up by the installation in a consistent way, one can use conditionals to detect an inconsistency and report it with ‘#error’. For example,

[4832] #if HASH_TABLE_SIZE % 2 == 0 || HASH_TABLE_SIZE % 3 == 0 \

[4833] || HASH_TABLE_SIZE % 5 == 0

[4834] #error HASH_TABLE_SIZE should not be divisible by a small \

[4835] prime

[4836] #endif
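The consistency check above can be sketched in a complete fragment as follows. The value 101 and the function bucket_for are assumptions chosen for illustration; since 101 is not divisible by 2, 3 or 5, the ‘#error’ line is skipped and compilation proceeds.

```c
#define HASH_TABLE_SIZE 101   /* assumed configuration value */

#if HASH_TABLE_SIZE % 2 == 0 || HASH_TABLE_SIZE % 3 == 0 \
    || HASH_TABLE_SIZE % 5 == 0
#error HASH_TABLE_SIZE should not be divisible by a small prime
#endif

/* Maps a key to a bucket index in the range 0 .. HASH_TABLE_SIZE - 1. */
unsigned bucket_for(unsigned key)
{
    return key % HASH_TABLE_SIZE;
}
```

Changing the definition to a value such as 100 would make the preprocessor stop with the message given on the ‘#error’ line.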

[4837] The directive ‘#warning’ is like the directive ‘#error’, but causes the preprocessor to issue a warning and continue preprocessing. The rest of the line that follows ‘#warning’ is used as the warning message.

[4838] One might use ‘#warning’ in obsolete header files, with a message directing the user to the header file which should be used instead.

[4839] Additional Preprocessor Information

[4840] Combining Source Files

[4841] One of the jobs of the C preprocessor is to inform the C compiler of where each line of C code came from: which source file and which line number.

[4842] C code can come from multiple source files if one uses ‘#include’; both ‘#include’ and the use of conditionals and macros can cause the line number of a line in the preprocessor output to be different from the line's number in the original source file. One may appreciate the value of making both the C compiler (in error messages) and symbolic debuggers such as GDB use the line numbers in the source file.

[4843] The C preprocessor builds on this feature by offering a directive by which one can control the feature explicitly. This is useful when a file for input to the C preprocessor is the output from another program such as the bison parser generator, which operates on another file that is the true source file. Parts of the output from bison are generated from scratch, other parts come from a standard parser file. The rest are copied nearly verbatim from the source file, but their line numbers in the bison output are not the same as their original line numbers. Naturally one would like compiler error messages and symbolic debuggers to know the original source file and line number of each line in the bison input. bison arranges this by writing ‘#line’ directives into the output file. ‘#line’ is a directive that specifies the original line number and source file name for subsequent input in the current preprocessor input file. ‘#line’ has three variants:

[4844] #line linenum

[4845] Here linenum is a decimal integer constant. This specifies that the line number of the following line of input, in its original source file, was linenum.

[4846] #line linenum filename

[4847] Here linenum is a decimal integer constant and filename is a string constant. This specifies that the following line of input came originally from source file filename and its line number there was linenum. Keep in mind that filename is not just a file name; it is surrounded by double quote characters so that it looks like a string constant.

[4848] #line anything else

[4849] anything else is checked for macro calls, which are expanded. The result should be a decimal integer constant followed optionally by a string constant, as described above.

[4850] ‘#line’ directives alter the results of the ‘__FILE__’ and ‘__LINE__’ predefined macros from that point on.
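The effect of ‘#line’ on the two predefined macros can be sketched as follows. The file name "parser.y", the line number 100 and the function names are hypothetical values chosen for the illustration.

```c
/* Pretend that the text which follows came from line 100 of parser.y. */
#line 100 "parser.y"
int reported_line(void) { return __LINE__; }          /* counted as line 100 */
const char *reported_file(void) { return __FILE__; }  /* counted as line 101 */
```

Everything after the directive, including any compiler diagnostics, is attributed to the named file with line numbers counted from 100, which is the behavior a generator such as bison relies on.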

[4851] The output of the preprocessor (which is the input for the rest of the compiler) contains directives that look much like ‘#line’ directives. They start with just ‘#’ instead of ‘#line’, but this is followed by a line number and file name as in ‘#line’.

[4852] Miscellaneous Preprocessing Directives

[4853] This section describes three additional preprocessing directives. They are not very useful, but are mentioned for completeness. The null directive consists of a ‘#’ followed by a newline, with only whitespace (including comments) in between. A null directive is understood as a preprocessing directive but has no effect on the preprocessor output. The primary significance of the existence of the null directive is that an input line consisting of just a ‘#’ produces no output, rather than a line of output containing just a ‘#’. Supposedly some old C programs contain such lines.

[4854] The ANSI standard specifies that the ‘#pragma’ directive has an arbitrary, implementation-defined effect. In the GNU C preprocessor, ‘#pragma’ directives are not used, except for ‘#pragma once’. However, they are left in the preprocessor output, so they are available to the compilation pass. The ‘#ident’ directive is supported for compatibility with certain other systems. It is followed by a line of text. On some systems, the text is copied into a special place in the object file; on most systems, the text is ignored and this directive has no effect. Typically ‘#ident’ is only used in header files supplied with those systems where it is meaningful.

[4855] C Preprocessor Output

[4856] The output from the C preprocessor looks much like the input, except that all preprocessing directive lines have been replaced with blank lines and all comments with spaces. Whitespace within a line is not altered; however, a space is inserted after the expansions of most macro calls.

[4857] Source file name and line number information is conveyed by lines of the form

[4858] # linenum filename flags

[4859] which are inserted as needed into the middle of the input (but never within a string or character constant). Such a line means that the following line originated in file filename at line linenum.

[4860] After the file name comes zero or more flags, which are ‘1’, ‘2’, ‘3’, or ‘4’. If there are multiple flags, spaces separate them. Here is what the flags mean:

[4861] ‘1’ This indicates the start of a new file.

[4862] ‘2’ This indicates returning to a file (after having included another file).

[4863] ‘3’ This indicates that the following text comes from a system header file, so certain warnings should be suppressed.

[4864] ‘4’ This indicates that the following text should be treated as C.

[4865] Invoking the C Preprocessor

[4866] Introduction

[4867] Most often when one uses the C preprocessor one does not have to invoke it explicitly: the C compiler does so automatically. However, the preprocessor is sometimes useful on its own. The C preprocessor expects two file names as arguments, infile and outfile. The preprocessor reads infile together with any other files it specifies with ‘#include’. All the output generated by the combined input files is written in outfile. Either infile or outfile may be ‘-’, which as infile means to read from standard input and as outfile means to write to standard output. Also, if outfile or both file names are omitted, the standard output and standard input are used for the omitted file names.

[4868] Command Line Options

[4869] Here is a table of command options accepted by the C preprocessor. These options can also be given when compiling a C program; they are passed along automatically to the preprocessor when it is invoked by the compiler.

[4870] ‘-P’

[4871] Inhibit generation of ‘#’-lines with line-number information in the output from the preprocessor. This might be useful when running the preprocessor on something that is not C code and may be sent to a program which might be confused by the ‘#’-lines.

[4872] ‘-C’

[4873] Do not discard comments: pass them through to the output file. Comments appearing in arguments of a macro call are copied to the output before the expansion of the macro call.

[4874] ‘-traditional’

[4875] Try to imitate the behavior of old-fashioned C, as opposed to ANSI C.

[4876] Traditional macro expansion pays no attention to single quote or double quote characters; macro argument symbols are replaced by the argument values even when they appear within apparent string or character constants.

[4877] Traditionally, it is permissible for a macro expansion to end in the middle of a string or character constant. The constant continues into the text surrounding the macro call.

[4878] However, traditionally the end of the line terminates a string or character constant, with no error.

[4879] In traditional C, a comment is equivalent to no text at all. (In ANSI C, a comment counts as whitespace.)

[4880] Traditional C does not have the concept of a “preprocessing number”. It considers ‘1.0e+4’ to be three tokens: ‘1.0e’, ‘+’, and ‘4’.

[4881] A macro is not suppressed within its own definition, in traditional C. Thus, any macro that is used recursively inevitably causes an error.

[4882] The character ‘#’ has no special meaning within a macro definition in traditional C.

[4883] In traditional C, the text at the end of a macro expansion can run together with the text after the macro call, to produce a single token. (This is impossible in ANSI C.)

[4884] Traditionally, ‘\’ inside a macro argument suppresses the syntactic significance of the following character.

[4885] ‘-trigraphs’

[4886] Process ANSI standard trigraph sequences. These are three-character sequences, all starting with ‘??’, that are defined by ANSI C to stand for single characters. For example, ‘??/’ stands for ‘\’, so ‘'??/n'’ is a character constant for a newline. Strictly speaking, the GNU C preprocessor does not support all programs in ANSI Standard C unless ‘-trigraphs’ is used, but if one ever notices the difference it will be with relief.

[4887] One doesn't want to know any more about trigraphs.

[4888] ‘-pedantic’

[4889] Issue warnings required by the ANSI C standard in certain cases such as when text other than a comment follows ‘#else’ or ‘#endif’.

[4890] ‘-pedantic-errors’

[4891] Like ‘-pedantic’, except that errors are produced rather than warnings.

[4892] ‘-Wtrigraphs’

[4893] Warn if any trigraphs are encountered (assuming they are enabled).

[4894] ‘-Wcomment’

[4895] Warn whenever a comment-start sequence ‘/*’ appears in a comment.

[4896] ‘-Wall’

[4897] Requests both ‘-Wtrigraphs’ and ‘-Wcomment’ (but not ‘-Wtraditional’).

[4898] ‘-Wtraditional’

[4899] Warn about certain constructs that behave differently in traditional and ANSI C.

[4900] ‘-I directory’

[4901] Add the directory to the head of the list of directories to be searched for header files. This can be used to override a system header file, substituting one's own version, since these directories are searched before the system header file directories. If one uses more than one ‘-I’ option, the directories are scanned in left-to-right order; the standard system directories come after.

[4902] ‘-I-’

[4903] Any directories specified with ‘-I’ options before the ‘-I-’ option are searched only for the case of ‘#include “file”’; they are not searched for ‘#include <file>’. If additional directories are specified with ‘-I’ options after the ‘-I-’, these directories are searched for all ‘#include’ directives. In addition, the ‘-I-’ option inhibits the use of the current directory as the first search directory for ‘#include “file”’. Therefore, the current directory is searched only if it is requested explicitly with ‘-I.’. Specifying both ‘-I-’ and ‘-I.’ allows one to control precisely which directories are searched before the current one and which are searched after.

[4904] ‘-nostdinc’

[4905] Do not search the standard system directories for header files. Only the directories one has specified with ‘-I’ options (and the current directory, if appropriate) are searched.

[4906] ‘-nostdinc++’

[4907] Do not search for header files in the C++-specific standard directories, but do still search the other standard directories. (This option is used when building libg++.)

[4908] ‘-D name’

[4909] Predefine name as a macro, with definition ‘1’.

[4910] ‘-D name=definition’

[4911] Predefine name as a macro, with definition. There are no restrictions on the contents of definition, but if one is invoking the preprocessor from a shell or shell-like program one may need to use the shell's quoting syntax to protect characters such as spaces that have a meaning in the shell syntax. If one uses more than one ‘-D’ for the same name, the rightmost definition takes effect.

[4912] ‘-U name’

[4913] Do not predefine name. If both ‘-U’ and ‘-D’ are specified for one name, the ‘-U’ beats the ‘-D’ and the name is not predefined.

[4914] ‘-undef’

[4915] Do not predefine any nonstandard macros.

[4916] ‘-A predicate(answer)’

[4917] Make an assertion with the predicate and answer.

[4918] One can use ‘-A-’ to disable all predefined assertions; it also undefines all predefined macros that identify the type of target system.

[4919] ‘-dM’

[4920] Instead of outputting the result of preprocessing, output a list of ‘#define’ directives for all the macros defined during the execution of the preprocessor, including predefined macros. This gives one a way of finding out what is predefined in one's version of the preprocessor; assuming one has no file ‘foo.h’, the command

[4921] touch foo.h; cpp -dM foo.h

[4922] shows the values of any predefined macros.

[4923] ‘-dD’

[4924] Like ‘-dM’ except in two respects: it does not include the predefined macros, and it outputs both the ‘#define’ directives and the result of preprocessing. Both kinds of output go to the standard output file.

[4925] ‘-M [-MG]’

[4926] Instead of outputting the result of preprocessing, output a rule suitable for make describing the dependencies of the main source file. The preprocessor outputs one make rule containing the object file name for that source file, a colon, and the names of all the included files. If there are many included files then the rule is split into several lines using ‘\’-newline.

[4928] ‘-MG’ says to treat missing header files as generated files and assume they live in the same directory as the source file. It may be specified in addition to ‘-M’.

[4929] This feature is used in automatic updating of makefiles.

[4930] ‘-MM [-MG]’

[4931] Like ‘-M’ but mention only the files included with ‘#include “file”’. System header files included with ‘#include <file>’ are omitted.

[4932] ‘-MD file’

[4933] Like ‘-M’ but the dependency information is written to file. This is in addition to compiling the file as specified; ‘-MD’ does not inhibit ordinary compilation the way ‘-M’ does. When invoking gcc, do not specify the file argument. gcc creates file names made by replacing “.c” with “.d” at the end of the input file names. In Mach, one can use the utility md to merge multiple dependency files into a single dependency file suitable for use with the ‘make’ command.

[4934] ‘-MMD file’

[4935] Like ‘-MD’ except mention only user header files, not system header files.

[4936] ‘-H’

[4937] Print the name of each header file used, in addition to other normal activities.

[4938] ‘-imacros file’

[4939] Process file as input, discarding the resulting output, before processing the regular input file. Because the output generated from file is discarded, the only effect of ‘-imacros file’ is to make the macros defined in file available for use in the main input.

[4940] ‘-include file’

[4941] Process file as input, and include all the resulting output,before processing the regular input file.

[4942] ‘-idirafter dir’

[4943] Add the directory dir to the second include path. The directories on the second include path are searched when a header file is not found in any of the directories in the main include path (the one that ‘-I’ adds to).

[4944] ‘-iprefix prefix’

[4945] Specify prefix as the prefix for subsequent ‘-iwithprefix’ options.

[4946] ‘-iwithprefix dir’

[4947] Add a directory to the second include path. The directory's name is made by concatenating prefix and dir, where prefix was specified previously with ‘-iprefix’.

[4948] ‘-isystem dir’

[4949] Add a directory to the beginning of the second include path, marking it as a system directory, so that it gets the same special treatment as is applied to the standard system directories.

[4950] ‘-lang-c’

[4951] ‘-lang-c89’

[4952] ‘-lang-c++’

[4953] ‘-lang-objc’

[4954] ‘-lang-objc++’

[4955] Specify the source language. ‘-lang-c’ is the default; it allows recognition of C++ comments (comments that begin with ‘//’ and end at end of line), since this is a common feature and it may most likely be in the next C standard. ‘-lang-c89’ disables recognition of C++ comments. ‘-lang-c++’ handles C++ comment syntax and includes extra default include directories for C++. ‘-lang-objc’ enables the Objective C ‘#import’ directive. ‘-lang-objc++’ enables both C++ and Objective C extensions. These options are generated by the compiler driver gcc, but not passed from the ‘gcc’ command line unless one uses the driver's ‘-Wp’ option.

[4956] ‘-lint’

[4957] Look for commands to the program checker lint embedded in comments, and emit them preceded by ‘#pragma lint’.

[4958] For example, the comment ‘/* NOTREACHED */’ becomes ‘#pragma lint NOTREACHED’. This option is available only when one calls cpp directly; gcc may not pass it from its command line.

[4959] ‘-$’

[4960] Forbid the use of ‘$’ in identifiers. This is required for ANSI conformance. gcc automatically supplies this option to the preprocessor if one specifies ‘-ansi’, but gcc doesn't recognize the ‘-$’ option itself; to use it without the other effects of ‘-ansi’, one may call the preprocessor directly.

FPGA-Based Co-Processor API

[4961] The present section specifies in detail the performance and functional specification of one embodiment of the present invention. The present section describes how the various requirements are to be met. It also documents all the tests necessary to verify that each Handel-C and/or software unit functions correctly and that they integrate to work as one complete application.

[4962] In the context of the present section, various embodiments will now be set forth, and further elaborated upon subsequently during reference to FIGS. 88 through 92. It should be noted that the present embodiments are also particularly pertinent to the earlier discussions of parameterized macros under the heading “Parameterized macro expressions” set forth hereinabove during reference to FIG. 57A-2 and subsequent figures.

[4963] FIG. 87B illustrates a method 8750 for distributing cores, in accordance with one embodiment of the present invention. In general, in operation 8752, a core that includes a plurality of first variables is distributed without reference to one or more parameters. In one aspect of the present invention, the core may be distributed over a network. As an option, the network may include the Internet.

[4964] In one embodiment, the one or more parameters may include variable width. In a further aspect, the one or more parameters may include data type. In yet another aspect, the one or more parameters may include array size. In another aspect, the one or more parameters may include pipeline depth.

[4965] A computer program is then executed that includes a plurality of second variables with reference to the one or more parameters. See operation 8754. The execution of the computer program includes execution of the core. The one or more parameters of the first variables are then inferred from the one or more parameters of the second variables. See operation 8756.

[4966] By this design, the various principles disclosed herein may be used in a distributed environment where cores may be disseminated utilizing a network, and used by various computer applications.

[4967] FIG. 87C illustrates a method 8760 for using a library map during the design of cores, in accordance with one embodiment of the present invention. In general, in operation 8762, a plurality of macros which specify an interface is determined. In one aspect, the macros may be compiled in a file.

[4968] During the execution of each macro, one of a plurality of libraries is utilized in operation 8764. Each macro is capable of being executed utilizing different libraries. Note operation 8766. As an option, the macros may be executed on a co-processor which is capable of executing the macros utilizing different libraries.

[4969] In one embodiment of the present invention, a plurality of first variables in the macros may also be defined with reference to variable widths, and a plurality of second variables in the macros may be defined without reference to variable widths so that the variable widths of the second variables may be inferred from the variable widths of the first variables.

[4970] The present invention is thus adapted for automatically generating libraries for use in distributing software components without requiring the software components to be completely defined. The system receives a behavioral description of the system components and determines the optimal required functionality between hardware and software and provides that functionality while varying the parameters (e.g. size or power) of the hardware and/or software. Thus, for instance, the hardware and the processors for the software can be formed on a reconfigurable logic device, each being no bigger than is necessary to form the desired functions. The codesign system outputs a description of the required processors, machine code to run on the processors, and a net list or register transfer level description of the necessary hardware. It is possible for the user to write some parts of the description of the system at register transfer level to give closer control over the operation of the system, and the user can specify the processor or processors to be used, and can change, for instance, the partitioner, compilers or speed estimators used in the codesign system. Since the library has the latest technology in dynamic widths, the libraries are flexible in their ability to store and dynamically update their components based on the characteristics of a resolved system.

[4971] In another aspect of the present invention, a set of macros is initially developed to specify an interface. The hardware interface is thus specified using software macros. For example, a macro that says add A + B = C may translate into an adder with two input ports and an output.

[4972] As an option, a Handel-C file and a header file may be implemented with the declarations for the macros or file.

[4973] Thereafter, the C file may be compiled into a library. The variables may not be fully resolved at this point. The Handel C compiler may do width inferencing when the library is utilized in a program call. A width of constant values or a whole expression may be inferred. External references may also be made to another macro that is not in that particular library; such a reference is resolved to the other library when the call is invoked. Function pointers can also encapsulate a whole piece of hardware which can be resolved at runtime.

[4974] For example, in a system with two memory banks connected together in an FPGA, a pointer may point to a function pointer. Then, such function pointer can be assigned to any function and have several (e.g., seven (7)) functions that can be pointed to by the function pointer. Then, at runtime it could be resolved using the Handel-C RAM function to point to different memory banks to implement a multiplexor to any of the memory banks.

[4975] In another embodiment, three different graphic adapters could be defined by three different functions that were pointed to by a single function pointer. Such single function pointer could be changed at runtime to point to particular circuitry of the particular adapter that is to be executed. Also, the functionality of the device may be encapsulated in the API structure to switch between various video cards.

[4976] One example of such concept will be set forth hereinafter under the heading “Application Layer User Function Interface.” Such example utilizes an FPGA co-processor to pass the structure USER API Structure containing function pointers.

[4977] In use, when a programmer writes an expression once, he or she does not need to recode it every time. A macro is provided that compiles into a function that does something, but the programmer does not know what it does. One does not need the declarations for this call. Two very different functions are processed: version 1 or version 2; based on which one is enabled. This method is thus very configurable.

[4978] An arithmetic logic unit is defined that has a binary library and a floating point library. Then, in a co-processor system, one could point to one or the other of the platforms. The design of two different IP cores may be skipped, since two different versions of the IP Cores may be produced for the two different libraries. The functions may be used to make calls to the various points of the hardware from the pointers in the header file.

[4979] All of this may be done in the context of a co-processor system. As shown in FIG. 90, an Application Layer library and a Physical Layer library are provided. For each platform for which there is memory, one would have a separate library that defines the particular platform and its unique memory handling techniques. Header files that accompany the libraries may be called by User Core Implementation to access the memory of the physical core. It should be noted that the physical layer is set forth hereinafter in greater detail in the section under the heading “Physical Layer Interface.”

[4980] FIG. 87D illustrates a method 8770 for providing polymorphism using pointers, in accordance with one embodiment of the present invention. Operations are initially performed on a plurality of objects in multiple contexts using operators, as indicated in operation 8772. In one aspect, the operations may include video operations.

[4981] In accordance with the concept of polymorphism, different meanings are assigned to the operators in each of the contexts. Note operation 8774. To enhance such concept, the meanings are assigned to the operators in each of the contexts using pointers. Note operation 8776. In one aspect, the meanings may include functions. As an option, the meanings may be assigned during run-time. Further, the meanings may be selected utilizing a multiplexer.

[4982] The present embodiment allows mapping of video cards that can each be defined by a separate function. A single function pointer can be changed at runtime to point to any of the various functions. In one embodiment, a multiplexor switch may be used that points to one of a number of the function pointers.
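By way of illustration only, the run-time selection described above may be sketched in ordinary C rather than Handel-C. The adapter names, the DrawPixelFn signature and the SelectAdapter helper are hypothetical and do not appear elsewhere in this description; the table index merely plays the role of the multiplexor select.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical adapter operation: every adapter exposes the same signature. */
typedef int (*DrawPixelFn)(int x, int y);

/* Three stand-in adapter implementations (illustrative names only). */
static int AdapterA_Draw(int x, int y) { return 100 + x + y; }
static int AdapterB_Draw(int x, int y) { return 200 + x + y; }
static int AdapterC_Draw(int x, int y) { return 300 + x + y; }

/* Table of candidates; the index acts as the multiplexor select. */
static const DrawPixelFn kAdapters[] = {
    AdapterA_Draw, AdapterB_Draw, AdapterC_Draw
};
#define ADAPTER_COUNT (sizeof kAdapters / sizeof kAdapters[0])

/* The single function pointer through which callers reach an adapter. */
static DrawPixelFn CurrentDraw = AdapterA_Draw;

/* Re-point the single function pointer at run time. */
void SelectAdapter(size_t Select)
{
    assert(Select < ADAPTER_COUNT);
    CurrentDraw = kAdapters[Select];
}

/* Callers never name a particular adapter. */
int DrawPixel(int x, int y)
{
    return CurrentDraw(x, y);
}
```

In such a sketch, code calling DrawPixel is unchanged when a different adapter is selected, which is the property the single function pointer is intended to provide.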

[4983] FIG. 87E illustrates a method 8780 for generating libraries utilizing pre-compiler macros, in accordance with one embodiment of the present invention. In general, in operation 8782, a library is accessed that includes a plurality of functions. A precompiler constant is tested in operation 8784 so that one or more of the functions of the library can be selected based on the testing. Note operation 8786.

[4984] In one aspect, the precompiler constant may include a plurality of versions. As an option, the version may be selected utilizing a precompiler macro. In another aspect, the precompiler constant is tested to determine a state of an apparatus on which the functions are executed. In such an aspect, the state of the apparatus may be based on a current bit size.

[4985] One example of a program that would use the aforementioned libraries is as follows:

//--------------------------------------------------------------------------
#ifdef VERSION1
macro expr UnknownThing(a, b) = (a + b);
#elif defined (VERSION2)
macro expr UnknownThing(a, b) = (a @ b);
#endif
//--------------------------------------------------------------------------

[4986] In use, a library with an unknown in it can be passed therein at compile time to execute different functions. When a bit size goes above a certain level, one may have to be able to process it differently. As such, a library is created containing different compile time functions as separate macros. Users can set which macro is executed based on the state of the system by testing a precompiler constant. Further, the pre-compiler macro may be used to select which version is utilized.
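The same selection may be sketched in plain C for readers less familiar with Handel-C macro expressions. This is an analogy under stated assumptions, not the library itself: UNKNOWN_THING and Combine are hypothetical names, VERSION1 is assumed to be the default when neither constant is defined, and the Handel-C ‘@’ (bit concatenation) of version 2 is approximated by placing a above an assumed 8-bit b.

```c
#include <assert.h>

/* Assumption: the build defines VERSION1 or VERSION2; default to VERSION1. */
#if !defined(VERSION1) && !defined(VERSION2)
#define VERSION1
#endif

#ifdef VERSION1
/* Version 1: arithmetic sum. */
#define UNKNOWN_THING(a, b) ((a) + (b))
#elif defined(VERSION2)
/* Version 2: a placed above the 8 bits of b, a stand-in for concatenation. */
#define UNKNOWN_THING(a, b) (((a) << 8) | ((b) & 0xFFu))
#endif

/* The caller does not know, or need to know, which version was compiled in. */
unsigned Combine(unsigned a, unsigned b)
{
    assert(b <= 0xFFu); /* this sketch assumes b fits in 8 bits */
    return UNKNOWN_THING(a, b);
}
```

Building the same source with the VERSION2 constant defined would replace the body of Combine without any change to its callers.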

[4987] FIG. 87F illustrates a method 8790 for mimicking object oriented programming utilizing pointers in a programmable hardware architecture, in accordance with one embodiment of the present invention. Initially, in operation 8792, a structure is pointed to for executing a function involving a structure. Thereafter, in operation 8794, contents of the structure are analyzed. Further, at least one macro of a set of macros is selected based on the analysis. See operation 8796.

[4988] When programming in C++, a person has some data with a hidden pointer to the function. Then, whenever he or she has a “this” pointer in Handel C, there is a structure of data that can be pointed at to facilitate the function call.

[4989] Example:

[4990] A structure may be defined to have an integer therein. Two macros are provided: one that increments and one that decrements. Based on the contents of the integer, one can utilize the macros to provide incremental or decremental hardware. Further, one can utilize a multitude of these instances to have the macros work on the particular structure, and emulate a hardware register.
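The foregoing example may be sketched in C as follows. The Register type, the two macro names and the policy of counting up while below a limit are assumptions made purely for illustration; the description above does not fix any of them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register structure holding a single integer. */
typedef struct {
    int Value;
} Register;

/* The two macros of the example: one increments, one decrements. */
#define REG_INCREMENT(r) ((r)->Value += 1)
#define REG_DECREMENT(r) ((r)->Value -= 1)

/* Assumed policy: inspect the integer, count up while it is below Limit,
 * otherwise count down. The structure pointer plays the role of "this". */
void RegisterStep(Register *Reg, int Limit)
{
    assert(Reg != NULL);
    if (Reg->Value < Limit) {
        REG_INCREMENT(Reg);
    } else {
        REG_DECREMENT(Reg);
    }
}
```

Several Register instances may be stepped independently by passing a different pointer to the same function, in the manner of multiple register instances sharing one set of macros.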

[4991] When one opens a file in software, a handle for the file is used. Such handle may then be used for each call to the file to provide coordinated transfer of data to the file. The same type of structure may be utilized in Handel C to facilitate transfer of data, and modification of the data based on the correct hardware target.

[4992] This could be applied in any context. For example, a set of macros may be defined that contains a structure that has one or more sets of data. With respect to the User API structure, full function pointers may be passed to something to invoke different structures and pass data back and forth. This allows something different to be executed.

[4993] More information regarding the foregoing concepts of FIGS. 87B through 87F will now be set forth in greater detail.

[4994] An FPGA based co-processor provides a system with a re-configurable sub-processor capable of providing a system with a notable performance increase. A host and client architecture may be used to implement the co-processor system. The co-processor may function primarily as a client but may be capable of performing host operations, if the platform permits such operations. It may be possible for several co-processors to exist in a system. An FPGA based co-processor may not operate as a normal processor would. It may be capable of acting like a separate system depending on the resources available to the FPGA. An FPGA co-processor may also be able to perform complex operations on data with only platform constraints restricting the data quantities handled. The operational functionality of an FPGA based co-processor may not be implemented as sequences of instructions but by dedicated hardware circuits programmed into the FPGA device.

[4995] A host may make use of a client by making remote function calls. Co-processors may provide a multitude of re-configurable functionality. The functionality of the co-processor may be provided as a set of functions. Each function may have a unique index to distinguish it from other functions. Functions may normally be independent of each other and totally platform independent. It may be possible for functions to interact within a co-processor; this feature may be provided to the functions via a high level API. A function may have access to all shared resources that the co-processor has available; this feature may be provided to the functions via a high level API. To add new functionality to a co-processor a designer may create a new function. The set of functions available on a co-processor at any given time is entirely at the engineer's discretion.

[4996] The co-processor may be able to execute all available functions concurrently. This is a demonstration of the true parallelism that hardware provides. If it is required to execute the same function more than once then multiple copies of the function may be required, each with a unique address. Creating multiple copies of a function is easily achievable in Handel-C using function arrays.

[4997] Co-processor system functionality may be provided by a set of APIs. There may be a separate API for the host and client. Use of the two APIs may provide the user with total abstraction from the platform. This may allow platform independent code to be generated that interacts with the APIs. The APIs may manage all platform interaction and any communication protocols that are involved. Host programs may be able to use the host API to execute functions on a co-processor. The client co-processor may receive the messages from the host and stream data via the client API to and from its functions as required. The functions may interact with the client API to access co-processor resources.

[4998] A host may interact with a client as follows:

[4999] Begin the execution of a function

[5000] Send parameters to a function

[5001] Retrieve data from a function

[5002] Receive data ready notifications from a client

[5003] Perform auxiliary functionality

[5004] A client may interact with a host as follows:

[5005] Execute a function when instructed to do so.

[5006] Stream data to a function as required

[5007] Stream data from a function as required

[5008] Send data ready signals to a host

[5009] Provide the address of the function that generated a data ready signal

[5010] Perform auxiliary functionality

[5011] The APIs may be designed to provide an abstraction layer for interfacing software. This may allow user applications to be platform independent in relation to the co-processor API.

[5012] Interfacing applications may not be aware of the standards or protocols used by a host and client to communicate. The abstraction allows changes to the co-processor system to be made without significantly affecting user applications.

[5013] Host (CPU) API Specification

[5014] The host API describes the software that may interact with user applications running on the host platform.

[5015] The host API may provide a user with all the functionality they need to access and utilize an FPGA based co-processor. The host API may represent an FPGA based co-processor as a set of remote functions. There may be sufficient functionality included in the host API to reduce the overhead of a remote function call to a single standard local function call.

[5016] An application interfacing with the host API may be able to execute remote functions using two possible methods: execute and wait, or execute and continue. The execute and wait mechanism may mimic a normal function call; it may not return until the remote function has completed execution and the results have been retrieved. The execute and continue mechanism may allow several functions to be called without waiting for the results of others.

[5017] The host API may notify user applications of events using call-back functions. The call-back functions may be executed when a relevant co-processor event occurs. Use of the execute and continue mechanism allows a function to produce interim results, i.e., to produce multiple completion signals with multiple data returns.

[5018] API Structure

[5019] FIG. 88 illustrates an application program interface 8800, in accordance with one embodiment of the present invention. The host API 8802 is designed to be constructed from two major sections 8804, 8806. The two sections are present to allow separation of the platform dependent code from the platform independent code. A library may be built from each logical section of the design. This sectioning is done to decrease the effort required to port the API between various platforms. The platform dependent section may provide a common interface and may functionally target the host platform. It may be possible to have a variety of platform dependent sections available to provide support for a variety of different target platforms.

[5020] The API design is layered to ease maintenance. Each layer may represent a software library. Each library may provide a set of functions and macros for use within the API core. Each layer may have a common interface. Using common interfaces may increase the flexibility of the API. The common interfaces may be used as templates for the layers. New implementations can be based on the templates and, provided they are functionally compatible, they may be immediately compatible with existing systems.

[5021] Application Layer API Public Interface

[5022] The public interface provides the basic functions necessary for a user application to use an FPGA based co-processor.

[5023] Overlapped execution of remote functions can be achieved by directly accessing the physical layer interface and initiating data transfers to functions.

[5024] Call-back functions are used to implement an event driven system. The call-back functions are executed to inform the user application when events have occurred. Call-back functions may be repeatedly used depending on the nature of the event. The last event to be signaled before a transfer is completed would normally be a completion status report or a fatal error. Data is transferred to a call-back function in the form of a results structure.

[5025] Legacy styled remote function executions are performed using ExecuteFunctionWait. More advanced and overlapped remote function execution is performed using ReadData and WriteData.

API Public Functions

TransferResultsStructure ExecuteFunctionWait(
    unsigned int FunctionIndex,
    unsigned int DataAmountParameters,
    char *ParameterDataBuffer,
    unsigned int ReturnDataAmount,
    char *ReturnDataBuffer );

[5026] Parameters

[5027] FunctionIndex

[5028] Index of the function to be executed.

[5029] DataAmountParameters

[5030] Size of the parameter data buffer in bytes.

[5031] ParameterDataBuffer

[5032] A data buffer containing the parameters to send to the function to be executed.

[5033] ReturnDataAmount

[5034] Size of the return data buffer in bytes.

[5035] ReturnDataBuffer

[5036] A data buffer to store the return data from the function to be executed.

[5037] Return Value

[5038] The structure returned from this function may contain information about the completion results for the remote function execution.

[5039] Remarks

[5040] ExecuteFunctionWait is used to perform a legacy styled function call. The specified remote function may be executed and the contents of ParameterDataBuffer may be transferred to the function. When the data ready signal from the executed function is received the return data may be transferred and stored in the ReturnDataBuffer. This method of remote function execution can only be used on remote functions that have a traditional execution flow. FIG. 91 shows a traditional execution flow for a remote function. The host API may require certain tasks to be performed before any interaction with a co-processor occurs. StartCoprocessorSystem is provided by the host API for user applications to initialize the host API's subsystems.

[5041] void StartCoprocessorSystem( . . . );

[5042] Parameters

[5043] Remarks

[5044] Initializes the API and allocates required system resources. This may be called before any other API function.

[5045] To enable a graceful shut down the host API may provide a user application with a method for informing it that it is no longer required. ShutdownCoprocessorSystem is provided by the host API as the shutdown function for the co-processor system.

[5046] void ShutdownCoprocessorSystem( . . . );

[5047] Parameters

[5048] Remarks

[5049] Call this when the API functionality is no longer required. This may clean up any system resources being used by the API.
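A minimal host-side sketch of the execute and wait style of call is given below, purely for illustration. The body of ExecuteFunctionWait shown here is a stand-in that runs entirely on the host and involves no co-processor; the remote function at index 0 (summing its parameter bytes), the AddOnCoprocessor wrapper and the typedef forms of the result types are assumptions. In a real application the call would be bracketed by StartCoprocessorSystem and ShutdownCoprocessorSystem, whose parameters are not specified here and are therefore omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Result types as named in this specification, written as C typedefs. */
typedef enum {
    CPS_COMPLETED = 0, CPS_FATAL, CPS_TIMEOUT,
    CPS_SYSTEM_BUSY, CPS_IN_PROGRESS, CPS_ON_HOLD
} TransferResultsCodes;

typedef struct {
    unsigned int UniqueIdentifier;
    unsigned int QuantityOfDataTransferred;
    TransferResultsCodes ResultCode;
} TransferResultsStructure;

/* Stand-in for the host API call (assumption: no hardware is involved).
 * Hypothetical remote function 0 returns the sum of its parameter bytes. */
TransferResultsStructure ExecuteFunctionWait(
    unsigned int FunctionIndex,
    unsigned int DataAmountParameters, char *ParameterDataBuffer,
    unsigned int ReturnDataAmount, char *ReturnDataBuffer)
{
    TransferResultsStructure Results = { 1u, 0u, CPS_FATAL };
    unsigned char Sum = 0;
    unsigned int i;

    assert(ParameterDataBuffer != NULL && ReturnDataBuffer != NULL);
    if (FunctionIndex != 0u || ReturnDataAmount < 1u) {
        return Results; /* unknown function, or no room for the result */
    }
    for (i = 0; i < DataAmountParameters; i++) {
        Sum = (unsigned char)(Sum + (unsigned char)ParameterDataBuffer[i]);
    }
    ReturnDataBuffer[0] = (char)Sum;
    Results.QuantityOfDataTransferred = 1u;
    Results.ResultCode = CPS_COMPLETED;
    return Results;
}

/* Caller side: the remote call reads like an ordinary blocking call. */
int AddOnCoprocessor(char a, char b)
{
    char Parameters[2];
    char Result[1] = { 0 };
    TransferResultsStructure r;

    Parameters[0] = a;
    Parameters[1] = b;
    r = ExecuteFunctionWait(0u, (unsigned int)sizeof Parameters, Parameters,
                            (unsigned int)sizeof Result, Result);
    return (r.ResultCode == CPS_COMPLETED) ? Result[0] : -1;
}
```

The wrapper illustrates the stated aim of reducing a remote function call to a single local call: the caller supplies two buffers and inspects one result code.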

[5050] Physical Layer API Public Interface

[5051] The physical layer interface provides access to platform dependent features. The features that form the public interface for the physical layer enable a user to perform more advanced functionality than is possible with the application layer public interface.

[5052] As an FPGA is a re-configurable device a method for configuring the device is required. This is provided by the physical layer public interface.

Physical Layer Public Functions

int ConfigureCoprocessor( char *BitFile );

[5053] Parameters

[5054] BitFile

[5055] Name of a ‘.bit’ file to be loaded into the FPGA co-processor.

[5056] Return Value

[5057] If the function succeeds the return value is nonzero.

[5058] If the function fails the return value is zero.

[5059] Remarks

[5060] The ‘.bit’ file used may be compatible with the target FPGA device. The client API provides the means to design an FPGA based co-processor in Handel-C. Compiling Handel-C code to EDIF may enable the creation of a ‘.bit’ file using the FPGA vendor's software tools.

[5061] An FPGA co-processor is capable of supporting advanced functionality. For a user application to use advanced features it may be capable of transferring data from a co-processor whenever it needs to.

unsigned int ReadData( TransferConfiguration Configuration );

[5062] Parameters

[5063] Configuration

[5064] A structure that contains all the required data to begin the operation.

[5065] Return Value

[5066] The return value is a unique identifier for the operation. The identifier may be used during informative communication.

[5067] Remarks

[5068] This function may transfer data from a remote function. If the target function is not executing it may be executed.

[5069] An FPGA co-processor is capable of supporting advanced functionality. For a user application to use advanced features it may be capable of transferring data to a co-processor whenever it needs to.

unsigned int WriteData( TransferConfiguration *Configuration );

[5070] Parameters

[5071] Configuration

[5072] A structure that contains all the required data to begin the transfer.

[5073] Return Value

[5074] The return value is a unique identifier for the transaction.

[5075] Remarks

[5076] This function may transfer data to a remote function.

[5077] A user application may want to monitor the progress of an active transaction. The host API provides the QueryTransaction function for transaction monitoring purposes.

TransferResultsStructure QueryTransaction( unsigned int UniqueIdentifier );

[5078] Parameters

[5079] UniqueIdentifier

[5080] The identifier is used to provide a unique handle for each transaction.

[5081] Return Value

[5082] The structure returned from this function may contain information about the transaction being queried.

[5083] Remarks

[5084] Use this function to get intermediate results for an active transaction.
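One way an application might use QueryTransaction to follow an active transaction is sketched below. The QueryTransaction body is a simulated stand-in (each query reports four more bytes, completing at twelve), and WaitForTransaction and its MaxQueries bound are hypothetical helpers that are not part of the API described herein.

```c
#include <assert.h>

typedef enum {
    CPS_COMPLETED = 0, CPS_FATAL, CPS_TIMEOUT,
    CPS_SYSTEM_BUSY, CPS_IN_PROGRESS, CPS_ON_HOLD
} TransferResultsCodes;

typedef struct {
    unsigned int UniqueIdentifier;
    unsigned int QuantityOfDataTransferred;
    TransferResultsCodes ResultCode;
} TransferResultsStructure;

/* Simulated progress counter; stands in for a real transfer. */
static unsigned int SimulatedBytes = 0;

/* Stand-in for the host API function (assumption: simulated progress). */
TransferResultsStructure QueryTransaction(unsigned int UniqueIdentifier)
{
    TransferResultsStructure r;

    if (SimulatedBytes < 12u) {
        SimulatedBytes += 4u;
    }
    r.UniqueIdentifier = UniqueIdentifier;
    r.QuantityOfDataTransferred = SimulatedBytes;
    r.ResultCode = (SimulatedBytes >= 12u) ? CPS_COMPLETED : CPS_IN_PROGRESS;
    return r;
}

/* Query until the transaction is no longer in progress, or until
 * MaxQueries queries have been made. Returns the last status seen. */
TransferResultsStructure WaitForTransaction(unsigned int Id,
                                            unsigned int MaxQueries)
{
    TransferResultsStructure r;
    unsigned int Queries = 0;

    assert(MaxQueries > 0u);
    do {
        r = QueryTransaction(Id);
        Queries++;
    } while (r.ResultCode == CPS_IN_PROGRESS && Queries < MaxQueries);
    return r;
}
```

Bounding the number of queries is a design choice of this sketch; it keeps the loop from spinning indefinitely if a transaction never leaves the in-progress state.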

[5085] Call Back Functions and Structures

[5086] Call back functions are used throughout the API to prevent the need for polling. The use of call back functions builds an event driven system. When events occur the call back functions are executed to communicate information about the event. Typical events may be transfer completion, error notification and timeout.

[5087] Structures

[5088] Several of the physical layer functions require configuration data. The Configuration structure provides encapsulation for the configuration data.

struct Configuration {
    void (*TransferCallback)(TransferResultsStructure TransactionInformation);
    unsigned int DataQuantity;
    unsigned char *DataBuffer;
    unsigned int DestinationAddress;
    unsigned int MaxDesiredTransactionTime;
};

[5089] Members

[5090] PhysicalLayerEventHandler

[5091] This is the call-back function that is exclusive to the physical layer. The user should create this function and it should be based on the function prototype TransferCallback.

[5092] DataQuantity

[5093] This value refers to the amount of data in bytes to be transferred.

[5094] DataBuffer

[5095] A pointer to a data buffer. If receiving data, the buffer may be at least as big as DataQuantity bytes.

[5096] DestinationAddress

[5097] The destination address refers to the index of the function to which the data is to be transferred.

[5098] MaxDesiredTransactionTime

[5099] This value specifies a length of time in milliseconds. It is used to indicate the maximum desired time for a transaction. This allows transactions to be aborted if they are taking too long.

[5100] Remarks

[5101] The configuration structure is used when calling functions in the physical layer.

[5102] The status results for a particular function are encapsulated in the TransferResultsStructure. This structure is commonly passed to call-back functions but is also used by QueryTransaction.

struct TransferResultsStructure {
    unsigned int UniqueIdentifier;
    unsigned int QuantityOfDataTransferred;
    TransferResultsCodes ResultCode;
};

[5103] Members

[5104] QuantityOfDataTransferred

[5105] This value is used to indicate how many bytes were successfully transferred.

[5106] ResultCode

[5107] The result code may be one of the defined states for the enumerated data type TransferResultsCodes.

[5108] Remarks

[5109] The transfer results structure contains information about a recent transfer request.

[5110] Possible values for the status codes are pre-defined using an enumerated data type.

enum TransferResultsCodes {
    CPS_COMPLETED = 0,
    CPS_FATAL,
    CPS_TIMEOUT,
    CPS_SYSTEM_BUSY,
    CPS_IN_PROGRESS,
    CPS_ON_HOLD
};

[5111] Remarks

[5112] The results codes are used by the system to indicate transaction results.

[5113] Callback Functions

[5114] Transfer call-back functions are used to generate the event driven system. They are user-created functions that are passed to the API; they may be based on the TransferCallback function prototype.

[5115] void TransferCallback(const TransferResultsStructure TransactionResults)

[5116] Parameters

[5117] TransactionResults

[5118] This is a structure that contains information about the reason for executing the call-back function. The transaction is only terminated when one of CPS_COMPLETED, CPS_FATAL or CPS_TIMEOUT is indicated as the result code.

[5119] Remarks

[5120] The transfer call back function is used as the event handler for a transaction. The user may provide this function if they require overlapped co-processor operations.
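A user-supplied call-back following the TransferCallback prototype may be sketched as below. The IsTerminal helper and the two state variables are hypothetical; only the prototype, the structure members and the three terminating result codes are taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    CPS_COMPLETED = 0, CPS_FATAL, CPS_TIMEOUT,
    CPS_SYSTEM_BUSY, CPS_IN_PROGRESS, CPS_ON_HOLD
} TransferResultsCodes;

typedef struct {
    unsigned int UniqueIdentifier;
    unsigned int QuantityOfDataTransferred;
    TransferResultsCodes ResultCode;
} TransferResultsStructure;

/* Hypothetical application state updated by the call-back. */
static unsigned int BytesSeen = 0;
static bool TransactionOpen = true;

/* True for the three result codes that terminate a transaction. */
bool IsTerminal(TransferResultsCodes Code)
{
    return Code == CPS_COMPLETED || Code == CPS_FATAL || Code == CPS_TIMEOUT;
}

/* Event handler: may run several times for interim results, and closes
 * the transaction only on a terminating result code. */
void TransferCallback(const TransferResultsStructure TransactionResults)
{
    assert(TransactionOpen); /* no events are expected after termination */
    BytesSeen = TransactionResults.QuantityOfDataTransferred;
    if (IsTerminal(TransactionResults.ResultCode)) {
        TransactionOpen = false;
    }
}
```

Because interim events such as CPS_IN_PROGRESS leave the transaction open, the same handler serves both single-result and multiple-result remote functions.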

[5121] Host Communication with a Client

[5122] The methods a host uses to transport data to and from a client are very platform dependent.

[5123] Client (FPGA) API Specification

[5124] The client API deals with the FPGA portion of the co-processor system. Everything described here refers to the hardware required to construct an FPGA based co-processor. Many of the descriptions used in this section use software terminology; this is possible due to the Handel-C programming language that allows hardware to be described in terms of algorithms using a C styled syntax.

[5125] API Structure

[5126] The client API may consist of macros and functions for two purposes.

[5127] The development of functions that a host may access and execute.

[5128] The construction of the hardware required to interact with a hostand the platform resources.

[5129] To ensure maximum maintainability the client API may be dividedinto two major sections. The two sections are created to separateplatform independent code and platform dependent code.

[5130] The platform independent section has been named the ‘applicationlayer’. The platform dependent section has been named the ‘physicallayer’. The physical layer may form the ‘physical layer library’. Theapplication layer may form the ‘application layer library’.

[5131] The client API may enable a user to create hardware for use by a host using Handel-C and representing the hardware as Handel-C functions. The user created hardware for host use may be called ‘user functions’. One can apply software terminology to the hardware due to the abstraction that the Handel-C language provides.

[5132] The application layer may contain macros and functions that are used by user functions. The application layer may provide user functions with a layer of total platform abstraction. This may allow user functions to be designed once for any platform.

[5133] A user may use the API libraries to construct a ‘physical core’ and one or more ‘user cores’. The purpose of a user core may be to reference the user functions and associate the user functions with indexes using platform independent methods.

[5134] The purpose of the physical core is to provide a separate file that a user can use to interact with the platform. This may allow the work required to port a co-processor to be limited to only minor modifications of the physical core.

[5135] Configuration of platform resources may be possible using the physical layer of the API when creating a physical core.

[5136] FIG. 89 illustrates a schematic 8900 showing that the physical layer 8902 is divided into a further two sections, 8904 and 8906, in accordance with one embodiment of the present invention. Shared resources are handled by section 2 of the physical layer and host interaction is handled by section 1 of the physical layer. The two sections of the physical layer are accessible through a common interface, the ‘physical layer platform independent interface’. A common interface 8908 for the physical layer is defined to ensure that different implementations (for different platforms) of the physical layer are compatible with the application layer 8910.

[5137] User functions gain access to the application layer API via the parameters passed to the function when it is executed. This may allow the API libraries to distinguish between user functions when API function calls are made. The concept of executing a hardware function relates to a signal changing to indicate ‘go’.

[5138] A situation may arise where a user function requires direct access to auxiliary I/O on a particular platform. A user function may be able to access auxiliary I/O by accessing a set of macros that form connections to the auxiliary I/O. Use of auxiliary I/O may compromise the portability of a user function but the auxiliary I/O system may be designed to minimize the impact.

[5139] The purpose of the physical layer is to provide some abstraction from platform features. This may allow the application layer to expect from the physical layer a common interface with relatively common features. A physical layer library may be constructed for each target platform. The physical layer libraries may contain a set of macros based on the common template provided in the design of the physical layer. API users may be able to create the top level of a co-processor using the relevant API physical layer library.

[5140] Use of the Client API Libraries

[5141] FIG. 90 is a schematic diagram 9000 of the application layer 9002, physical layer 9004, and user domain 9006, in accordance with one embodiment of the present invention.

[5142] The API libraries may provide layers of abstraction to make a co-processor as portable as possible. Any parts of the system that interact solely with the application layer can be considered totally platform independent as the application layer is itself totally platform independent.

[5143] User functions may interact with the application layer library using the user function header. The user function header may prototype the API functions that may be passed to a user function when it is executed. The API functions may be encapsulated into a structure and the structure may be passed to the user function when it is executed. This mechanism is used to ensure forward compatibility and to allow the application layer library to distinguish between the user functions in the most efficient method possible.

[5144] User cores may be created using the macros of the application layer library. When a user core is constructed the user may reference the functions that may form the co-processor functionality. Indexes may be assigned to the user functions when the user core is created.

[5145] The purpose of the physical layer library is to provide a common interface to the platform features. Due to the possible diversity of the features a platform may provide, a physical layer library may be created for each platform to be supported. This may allow people who use the physical layer libraries to do so knowing that their co-processor may be easy to port to other platforms. The physical layer library is very platform dependent but is intended to enable a user to create a physical core that is not very platform dependent.

[5146] Creation of a Co-Processor

[5147] Building a co-processor may require a user to generate several files using the API libraries. User functions may use the API's user function header to access API functionality. The actual co-processor is built by the user creating a physical core. The user may also create at least one user core to accompany the physical core. The user core may be platform independent as it may only interact with the application layer section of the API. The physical core may be classed as platform dependent but the API may provide some abstraction via the physical layer, allowing rapid porting of a physical core. The user can configure various features of a platform during creation of a physical core.

[5148] The physical core forms the top level for a co-processor implementation.

[5149] File Associations

[5150] FIG. 90 shows the files that may be required to construct an FPGA based co-processor, in accordance with one embodiment of the present invention.

[5151] The uses tags are numbered to allow explanation of the interconnections. See FIG. 90 for the numbered uses tags. A user's physical core may include the physical layer library header to gain access to the physical layer library. The physical layer library header may contain declarations that may reference the public contents of the physical layer library. A user's physical core should include the system configuration header file. When the physical layer library macros are used they may use the configuration data. A user's physical core may be able to link to a number of user cores. A user's physical core may provide a user core with a clock and the relevant functionality to enable its operation. The user function header links to the physical layer library to gain access to the names of the auxiliary I/O ports. A user core may include the application layer library header to gain access to the macros in the application layer library.

[5152] The application layer library header may contain declarations that link to the macros and functions in the application layer library. The user function header links to the application layer library to provide user functions with access to the API. A user core may link to at least one user function. User functions may include the user function header to gain access to the API.

[5153] Co-Processor System Configuration

[5154] The co-processor API supports a large number of configuration options. The platform configuration may be performed when the user is creating the physical core. The user function index maps are created when the user creates a user core.

[5155] Common platform configuration options supported by the physical layer library (used when a user is creating a physical core):

[5156] Number of functions supported. This may configure how the host address decoder is built. The size and speed of the address decoder are dependent on the number of functions to be supported. The address decoder may use advanced techniques; therefore the index map seen by the host may not be incremental. Address to function index maps may be defined.

[5157] Configuration of the platform memory banks. Memory banks may be connected directly to the FPGA or accessed via the local bus. The physical layer library may manage any specifics. The user may be able to configure which memory banks map to which functions and whether the memory banks are shared or dedicated.

[5158] Type and buffering mode of the mailboxes.

[5159] Type and size of message queue to the host.

[5160] Common configuration options supported by the application layer library (used during the creation of a user core):

[5161] Index associated with a particular function.

[5162] Application Layer User Core Creation Interface

CoProcessorInitialiseSystem( UserCoreName ):

[5163] Parameters

[5164] UserCoreName

[5165] This should be a unique name for the user core. This name may be referenced by the physical core to make the necessary connection.

[5166] Remarks

[5167] This is a pre-compiler macro; it should be used at global scope. It should be used after the header is included for the application layer library. It performs the necessary definitions and declarations required for the user core.

CoProcessorAssociateFunction( UserCoreName, FunctionIndex, FunctionPointer ):

[5168] Parameters

[5169] UserCoreName

[5170] This should be a unique name for the user core. This name may be referenced by the physical core to make the necessary connection.

[5171] FunctionIndex

[5172] This is the index that may be used by the host to transfer data to the function being configured.

[5173] FunctionPointer

[5174] This is a pointer to the user function that is being associated with the specified index.

[5175] Remarks

[5176] This is a Handel-C macro procedure. It should be called within the main function of a user core. It is used to assign an index to a function.

CoProcessorStart( UserCoreName );

[5177] Parameters

[5178] UserCoreName

[5179] This should be a unique name for the user core. This name may be referenced by the physical core to make the necessary connection.

[5180] Remarks

[5181] This is a Handel-C macro procedure. It should be the last call made to any of the co-processor system macros. It should be located in the main function of a user core. It may become the main handler for all physical core interaction. This macro may never return as it may contain a forever loop.
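The three user core macros above describe an index-to-function association followed by a handler loop. The following is a minimal software sketch of that idea in plain C, not Handel-C and not the API itself. The names `UserCore`, `user_core_associate`, `user_core_dispatch`, `MAX_FUNCTIONS` and the two sample functions are hypothetical; the table size in particular is an assumption, since the text treats the number of supported functions as a platform configuration option.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FUNCTIONS 8   /* assumption: the real limit is configured per platform */

typedef void (*UserFunction)(uint32_t *data);

/* Hypothetical stand-in for the state a user core holds. */
typedef struct {
    UserFunction table[MAX_FUNCTIONS];
} UserCore;

/* Software analogue of CoProcessorAssociateFunction: bind an index to a function. */
int user_core_associate(UserCore *core, unsigned index, UserFunction fn)
{
    if (index >= MAX_FUNCTIONS || fn == NULL || core->table[index] != NULL)
        return -1;                 /* out of range, null, or index already taken */
    core->table[index] = fn;
    return 0;
}

/* One iteration of the handler that CoProcessorStart would repeat forever. */
int user_core_dispatch(const UserCore *core, unsigned index, uint32_t *data)
{
    if (index >= MAX_FUNCTIONS || core->table[index] == NULL)
        return -1;                 /* no user function bound to this index */
    core->table[index](data);
    return 0;
}

/* Two sample user functions, for illustration only. */
static void double_it(uint32_t *data) { *data *= 2u; }
static void add_one(uint32_t *data)   { *data += 1u; }
```

In this sketch a zero-initialised `UserCore` plays the role of CoProcessorInitialiseSystem, and indexes need not be contiguous, which matches the remark that the index map seen by the host may not be incremental.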

[5182] Application Layer User Function Interface

[5183] The majority of the application layer API is provided to a user function via a parameter passed to the user function when it is executed. The parameter is a structure. The structure may contain a set of function pointers. Passing a structure to a user function allows for forward compatibility. If at a later stage more functions need to be added to the API this can be done without affecting existing user functions.

[5184] The API may expect a user function prototype to look like this:

void UserFunctionName( USER_API ParameterName );

[5185] Remarks

[5186] UserFunctionName and ParameterName can be replaced with any legal C styled name.

[5187] This is the USER_API structure:

typedef struct {
    void (*CoProcessorSetAddress)(unsigned int 32 Address, unsigned int 1 ReadOrWrite);
    void (*CoProcessorDoTransfer)(unsigned int 32 *Data);
    void (*CoProcessorGetData)(unsigned int 32 *Data);
    void (*CoProcessorSendData)(unsigned int 32 Data);
    void (*CoProcessorNotifyDataReady)( );
    unsigned int 1 (*CoProcessorCheckForPost)( );
    unsigned int 32 (*CoProcessorGetSendersAddress)( );
    void (*CoProcessorSetPostAddress)(unsigned int 32 Address);
    void (*CoProcessorDoPostDataRead)(unsigned int 32 *Data);
    void (*CoProcessorDoPostDataWrite)(unsigned int 32 Data);
} USER_API;

[5188] Members

[5189] CoProcessorSetAddress

[5190] This function is used to initiate a memory data transfer. It allows an address to be set and the direction of the transfer to be configured. Memory access is pipelined and it takes more than one clock cycle for a transaction to be completed. Separation of the address and data phase allows burst mode transactions to be performed. The exact number of cycles it takes for a memory operation is dependent on the platform; to compensate for this CoProcessorDoTransfer may always ensure synchronization between the memory address phase and data transfer. The address phase is buffered to enable one address value to be written for every available memory address cycle. CoProcessorSetAddress may block if it has been called too many times before a call of CoProcessorDoTransfer. Memory address and data phases can be interleaved to provide a high memory bandwidth.

[5191] CoProcessorDoTransfer

[5192] This function is provided to handle the data phase of a memory access. It may block until a previous address phase has completed.

[5193] CoProcessorGetData

[5194] CoProcessorGetData gives a user function the ability to retrieve data from the host. This function may block until the host sends data.

[5195] CoProcessorSendData

[5196] CoProcessorSendData gives a user function the ability to send data to a host. This function may block until the host requests data from the function.

[5197] CoProcessorNotifyDataReady

[5198] This function is used by a user function to notify the host that data is ready. This may be used as required and is not restricted to only meaning data is ready.

[5199] CoProcessorCheckForPost

[5200] Used by a user function to test for the presence of post in the mailbox.

[5201] CoProcessorGetSendersAddress

[5202] Used to get the address of the sender of the data currently in the mailbox. This function should be called in parallel with or before CoProcessorDoPostDataRead.

[5203] CoProcessorSetPostAddress

[5204] Initiates the sending of mail. The address is configured for the sending and the next data to be sent may be forwarded to the address specified here.

[5205] CoProcessorDoPostDataRead

[5206] Gets data from the mailbox. This function may block if no data is waiting.

[5207] CoProcessorDoPostDataWrite

[5208] Sends data to a previously specified address. If an address has not been specified this function may block until an address is specified.

[5209] Remarks

[5210] The only part of the user API that is not provided to functions through the structure is access to auxiliary I/O. Macros are used to establish the links between a user function and auxiliary I/O. This method is used to allow a function direct access to auxiliary I/O with no interference from the core of the client API. Access to auxiliary I/O is deemed to be necessary as the nature of the devices connected to auxiliary I/O is unknown to the API.
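The separation of a memory access into a buffered address phase (CoProcessorSetAddress) and a data phase (CoProcessorDoTransfer) can be illustrated with a small software model. This is a sketch in plain C under stated assumptions, not the hardware implementation: `MemoryPort`, `set_address`, `do_transfer`, the memory size and the buffer depth of four are all hypothetical, and a `false` return stands in for the point where the hardware function would block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MEM_WORDS  16  /* assumption: tiny memory, for illustration only   */
#define ADDR_DEPTH 4   /* assumption: depth of the address-phase buffer    */

typedef struct { uint32_t address; bool is_read; } AddressPhase;

typedef struct {
    uint32_t     memory[MEM_WORDS];
    AddressPhase pending[ADDR_DEPTH];  /* circular buffer of address phases */
    unsigned     head, count;
} MemoryPort;

/* Address phase. Returns false where the hardware call would block
   (too many address phases issued before a data phase). */
bool set_address(MemoryPort *p, uint32_t address, bool is_read)
{
    assert(address < MEM_WORDS);
    if (p->count == ADDR_DEPTH)
        return false;
    p->pending[(p->head + p->count) % ADDR_DEPTH] = (AddressPhase){ address, is_read };
    p->count++;
    return true;
}

/* Data phase. Returns false where the hardware call would block
   (no address phase has been issued yet). */
bool do_transfer(MemoryPort *p, uint32_t *data)
{
    if (p->count == 0)
        return false;
    AddressPhase a = p->pending[p->head];
    p->head = (p->head + 1) % ADDR_DEPTH;
    p->count--;
    if (a.is_read)
        *data = p->memory[a.address];   /* active-high ReadOrWrite selects a read */
    else
        p->memory[a.address] = *data;   /* otherwise the register is written out  */
    return true;
}
```

Issuing several `set_address` calls before the matching `do_transfer` calls models a burst; alternating the two models the interleaved use described for CoProcessorSetAddress.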

[5211] User API functions in detail:

void CoProcessorSetAddress( unsigned int 32 Address, unsigned int 1 ReadOrWrite );

[5212] Parameters

[5213] Address

[5214] The address parameter represents the memory location for the target operation.

[5215] ReadOrWrite

[5216] Indicates the mode for the memory operation; an active high signal indicates a read operation.

[5217] Remarks

[5218] CoProcessorSetAddress is used to initiate a memory access operation. Memory access operations are separated into the address phase and data phase. The phase separation allows the system to achieve maximum bandwidth utilization.

void CoProcessorDoTransfer( unsigned int 32 *Data );

[5219] Parameters

[5220] Data

[5221] A pointer to a register. The register may be loaded or read depending on the mode selected during the synchronized address phase. Synchronization is performed by the system.

[5222] Remarks

[5223] CoProcessorDoTransfer is used to perform the data phase for a memory access operation. This function may automatically synchronize with the address phase.

void CoProcessorGetData( unsigned int 32 *Data );

[5224] Parameters

[5225] Data

[5226] A pointer to a register; the register is loaded with a data parameter sent by the host.

[5227] Remarks

[5228] CoProcessorGetData may block until data has been sent by the host and targeted at the user function using its copy of this function.

void CoProcessorSendData( unsigned int 32 Data );

[5229] Parameters

[5230] Data

[5231] The data that may be transferred to the host when it requests data from the user function.

[5232] Remarks

[5233] CoProcessorSendData may block until the host requests data from the user function using its copy of this function.

[5234] void CoProcessorNotifyDataReady( );

[5235] Remarks

[5236] A user function should use this function to notify the host that it wants to perform a data transfer operation. CoProcessorNotifyDataReady may send some form of interrupt to the host; the signal may be queued if other user functions are signaling at the same time.

[5237] unsigned int 1 CoProcessorCheckForPost( );

[5238] Return Value

[5239] The return value is active high to indicate that post is waiting.

[5240] Remarks

[5241] CoProcessorCheckForPost is used for testing the incoming mailbox for any contents.

[5242] unsigned int 32 CoProcessorGetSendersAddress( );

[5243] Return Value

[5244] The return value may be the function index of the function that sent the mail waiting in the mailbox.

[5245] Remarks

[5246] The mailbox is only emptied by CoProcessorDoPostDataRead; therefore repeated calls to CoProcessorGetSendersAddress may return the same result until the waiting mail has been retrieved. This function may block if there is no mail waiting in the mailbox.

void CoProcessorSetPostAddress( unsigned int 32 Address );

[5247] Parameters

[5248] Address

[5249] The address parameter represents the user function index for the user function that may receive mail sent by the user of this function.

[5250] Remarks

[5251] CoProcessorSetPostAddress configures the destination address for the mail to be sent. This function only needs to be used to set the destination at the beginning of a multi-message transfer.

[5252] This should be called a clock cycle before writing the data intended for the address being programmed.

void CoProcessorDoPostDataRead( unsigned int 32 *Data );

[5253] Parameters

[5254] Data

[5255] This is a pointer to a register. The data from the mailbox may be written to the pointed-to register.

[5256] Remarks

[5257] CoProcessorDoPostDataRead may retrieve the mail waiting in the mailbox and empty the mailbox. This function may block if no data is waiting.

void CoProcessorDoPostDataWrite( unsigned int 32 Data );

[5258] Parameters

[5259] Data

[5260] This may be sent to the addressed user function's mailbox.

[5261] Remarks

[5262] CoProcessorDoPostDataWrite may send mail to the address that is currently configured. It is not necessary to set the address for every mail message sent; the previous address may be used.

[5263] This function may block if the recipient's mailbox is full. The capacity of the mailbox is platform dependent.

[5264] API User Function Interface (Auxiliary I/O)

[5265] Auxiliary I/O is provided to user functions to allow a user to take advantage of any platform features that are outside the scope of the application API. These features are represented as the pin connections that the external features/devices are connected to. The API may make no attempt to translate or shield the user from auxiliary I/O. Access to auxiliary I/O is direct and provided on an ‘as is’ basis.

[5266] The application layer API provides access to auxiliary I/O via a set of macros. Auxiliary I/O ports are named and may be platform specific. The definitions for auxiliary I/O are stored in the physical layer library. The auxiliary I/O section of the application layer API provides access to the physical layer library information. When physical layer libraries are created the details of auxiliary I/O should be published with the library. The application layer provides access to the physical library in this way in an attempt to reduce the amount of effort required to port user functions that are dependent on platform specific features.

[5267] Auxiliary I/O should only be used in a direct one-to-one relationship with a user function. If more than one user function requires access to a shared resource a service user function should be developed. Other user functions can then communicate with the service user function using the mailbox system. This may make only the service function directly dependent on the auxiliary I/O, thus reducing the amount of effort required during porting.

[5268] These are the API auxiliary access macros:

CoProcessorConnectReadAUX( PortName )

[5269] Parameters

[5270] PortName

[5271] This should be the name of an I/O port.

[5272] Remarks

[5273] This is a pre-compiler macro and should be used at global scope. It declares a read port for auxiliary I/O. This macro may write a function that provides the functionality to read the named I/O port. Access to the function is provided by the CoProcessorAuxRead macro.

CoProcessorConnectWriteAUX( PortName )

[5274] Parameters

[5275] PortName

[5276] This should be the name of an I/O port.

[5277] Remarks

[5278] This is a pre-compiler macro and should be used at global scope. It declares a write port for auxiliary I/O. This macro may write a function that provides the functionality to write to the named I/O port. Access to the function is provided by the CoProcessorAuxWrite macro.

CoProcessorConnectReadWriteAUX( PortName )

[5279] Parameters

[5280] PortName

[5281] This should be the name of an I/O port.

[5282] Remarks

[5283] This is a pre-compiler macro and should be used at global scope. It declares a port for auxiliary I/O that is bidirectional. This macro may write functions that provide the functionality to read, write and set the output buffer mode for the named I/O port. Access to the functions is provided by the CoProcessorAuxRead, CoProcessorAuxWrite and CoProcessorAuxSetEnable macros.

CoProcessorAuxRead( PortName, unsigned int *Data )

[5284] Parameters

[5285] PortName

[5286] This should be the name of an I/O port.

[5287]

[5288] Data

[5289] This is a pointer to a register. The register may be loaded with the value currently on the I/O port.

[5290] Remarks

[5291] This is a Handel-C macro expression. It is created when read functionality is required on an auxiliary I/O port. The bit width of Data may match the port width. The bit width of the port can be determined using the CoProcessorPortWidth macro.

AuxSetWriteReg( PortName, unsigned int *Data )

[5292] Parameters

[5293] PortName

[5294] This should be the name of an I/O port.

[5295] Data

[5296] Data should be a pointer to a register.

[5297] Remarks

[5298] This is a Handel-C macro procedure. It is created when write functionality is required on an auxiliary I/O port. The bit width of Data may match the port width. The bit width of the port can be determined using the CoProcessorPortWidth macro. Data may become the output for the I/O port.

CoProcessorAuxSetEnable( PortName, unsigned int 1 Enable )

[5299] Parameters

[5300] PortName

[5301] This should be the name of an I/O port.

[5302] Enable

[5303] The enable signal is used to set the mode for the output buffers. If Enable is active the output buffers are set to a high impedance mode.

[5304] Remarks

[5305] This is a Handel-C macro procedure. It is created when read and write across an auxiliary I/O port is required. This macro is used to access the enable function created when I/O is mapped.

CoProcessorPortWidth( PortName )

[5306] Parameters

[5307] PortName

[5308] This should be the name of an I/O port.

[5309] Remarks

[5310] This is a pre-compiler macro that can be used anywhere. It is a utility macro that allows access to the width of an auxiliary I/O port. This is useful when defining variables that connect to a port.

[5311] Physical Layer Interface

[5312] A common interface for the physical layer is defined to ensure that all implementations of the physical layer are compatible with the application layer. The physical layer interface may allow the configuration and creation of the hardware necessary to manage:

[5313] Memory

[5314] Primary bus interface

[5315] System clock synchronization

[5316] Co-Processor Construction Macros

[5317] CoProcessorBuild( )

[5318] Remarks

[5319] This is a pre-compiler macro. It should only be used at global scope. It may construct the necessary connection to the local bus and any other platform specific definitions.

[5320] CoProcessorActivate( )

[5321] Remarks

[5322] This is a Handel-C macro procedure. It should only be used at local scope, preferably in a main function. It may activate any platform specific background handler tasks.

CoProcessorSetUserCoreClock( UserCoreName, UserCoreClockSource )

[5323] Parameters

[5324] UserCoreName

[5325] This may be the name given to a user core when it was created.

[5326] UserCoreClockSource

[5327] This may be one of the available clock sources defined for the platform.

[5328] Remarks

[5329] This is a pre-compiler macro. It should only be used at global scope. This macro may be used to configure the clock source for a user core. It may be possible for a user core to be clocked at a different rate to the physical core. This is only possible if the physical layer library for the target platform provides more than one clock source.

CoProcessorCreateUserFunctionPort( UserCoreName, DesiredHostAddress, UserCoreFunctionIndex, InitialMemoryAccessController, PostalAddress )

[5330] Parameters

[5331] UserCoreName

[5332] This may be the name given to a user core when it was created.

[5333] DesiredHostAddress

[5334] This is the address that an external host may use to access the user function being set up.

[5335] UserCoreFunctionIndex

[5336] This is the unique index that is used internally by the user core to identify the user function.

[5337] InitialMemoryAccessController

[5338] This is the index of the memory access controller that may be initially associated with the user function.

[5339] PostalAddress

[5340] This is a unique identifier that other user functions can use to send messages to the user function being configured.

[5341] Remarks

[5342] Any user function that is to be used by a host may be set up using this macro.

[5343] Memory Bank Construction Macros

[5344] The memory ports are constructed in the physical core. This allows the memory access controllers to run faster than the user functions. Memory controllers are device specific and may be configured as dedicated or shared. When a memory bank is shared the number of ports to be created may be defined.

[5345] Memory management units may be constructed by referencing a bank's name. The names given to the memory banks may be a platform constant and may be located in the physical layer library. Memory banks should be constructed before the system handlers are initiated.

CoProcessorBuildDedicatedMemoryController( BankName, MemoryBankUniqueIdentifier )

[5346] Parameters

[5347] BankName

[5348] The name of a memory bank.

[5349] MemoryBankUniqueIdentifier

[5350] This is a unique identifier for the memory bank. It may be required when configuring user functions.

[5351] Remarks

[5352] This macro is a pre-compiler macro. It should only be used in global scope. It constructs a set of functions that form a memory management unit for the named memory bank. The method of memory management used is single port exclusive; therefore the MMU is a simple transaction sequencer.

CoProcessorBuildMultiPortMemoryController( BankName, NumberOfPorts )

[5353] Parameters

[5354] BankName

[5355] The name of a memory bank.

[5356] NumberOfPorts

[5357] The number of ports to generate. This represents the number of duplicate functions to create.

[5358] Remarks

[5359] This macro is a pre-compiler macro. It should only be used in global scope. It constructs a set of functions that form a memory management unit for the named memory bank. A multi-port MMU is constructed that sequences memory requests and provides simple arbitration for the available ports. Semaphores are created and the access functions are constructed as an array of functions.

CoProcessorSetPortUniqueIdentifier( BankName, BankPortIndex, UniqueIdentifier )

[5360] Parameters

[5361] BankName

[5362] This is the name of the bank that is being referred to.

[5363] BankPortIndex

[5364] This refers to the particular port on the multi-port mem

[5365] UniqueIdentifier

[5366] Remarks

CoProcessorActivateMMU( BankName )

[5367] Parameters

[5368] BankName

[5369] The name of a memory bank.

[5370] Remarks

[5371] This macro should be called within the main function of the physical core. It starts any background memory management functions that may be required.

[5372] Physical Layer, Connection to Host

[5373] The actual data transfer between a host and an FPGA is platform specific and is beyond the scope of the specification for the API for a co-processor.

[5374] The physical layer may not be restricted to using any particular method or protocol for communicating with a host. The only constraint is that the host may be capable of ‘speaking the same language’ as the FPGA co-processor.

[5375] The link between a host and client may be capable of performing several basic signaling functions:

[5376] An address should be associated with any data transferred.

[5377] A client may be able to send a signal to a host to inform the host that the client is ready to perform some form of data transfer.

[5378] In an environment where more than one host is present the client may be able to distinguish between each host and have the capability of signaling to a host exclusively and directly.

[5379] If client mode host functionality is required the client may be able to request access to the data transfer medium.

[5380] Physical Layer, Shared Resources

[5381] The API may provide management of the shared resources. This may primarily involve mutual-exclusion enforcement. Further extensions may provide features such as a static or a dynamic MMU. The auxiliary command system may provide access to features such as bank switching or a dynamic MMU.

[5382] Auxiliary I/O may be provided via an I/O mapped system. User functions may use a set of macros to generate functions to access a given auxiliary I/O port. Auxiliary I/O ports may be defined in the header file provided for accessing the user function macros. When developing a new platform, auxiliary I/O should be named and the ports defined in the physical library.

[5383] When building a co-processor one step may be to configure the method used to access any available memory banks. This configuration step may usually only be done once for a platform unless the memory bank configuration needs to be changed. It may be possible to configure a memory bank as a dedicated bank or a multi-port bank. The option for a dedicated or multi-port RAM bank is given to allow a function to have exclusive access to a memory bank or to allow several functions to share access to a memory bank. When the library for a new platform is developed each memory bank may be given a unique name.

[5384] Getting Data From the Host

[5385] A client is not capable of requesting data from the host. A client function can use the GetData function to wait for the host to send data. The GetData function may block until the host transfers data to the client function.

[5386] Sending Data to the Host

[5387] A client cannot directly initiate a data transfer to the host. A client can notify the host that it has data ready to transfer. NotifyDataReady( ) is used to get the attention of the host. A client can never initiate a data transfer; using the notification function may signal to a host that the client wants to transfer data. How the host interprets the signal is dependent on the host application.

[5388] Use SendData to perform the actual transfer of data. This function may block until the provided data has been transferred. Data transfers are never initiated by the client. This function should normally only be used after sending a data ready notification.

[5389] Inter-Function Communication

[5390] A function may be capable of sending a message to another function. To do this a function may need to know the address of the destination function. Inter-function communication is achieved using mailboxes. A mailbox is a pair of registers. One register may be used for sending mail and the other for receiving mail. A flag may be used to indicate when new mail has arrived. A function should monitor the flag to determine when mail has arrived. The flag is an active high signal: it may be active when new mail is in the mailbox. If the flag is still high after a read of the mailbox, then new mail has already arrived.
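The receiving side of a mailbox, with its active high flag, can be modelled in a few lines of plain C. This is only a sketch: the `Mailbox` type and the four helper names are hypothetical, and a capacity of one message is an assumption, since the text states that mailbox capacity is platform dependent. A `false` return from `mailbox_write` marks the point where CoProcessorDoPostDataWrite would block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the receive register of one user function's mailbox.
   Assumption: the mailbox holds a single message. */
typedef struct {
    uint32_t data;    /* receive register                      */
    uint32_t sender;  /* index of the function that posted it  */
    bool     flag;    /* active high: mail is waiting          */
} Mailbox;

/* Deliver mail. Returns false where the sender would block (mailbox full). */
bool mailbox_write(Mailbox *box, uint32_t sender, uint32_t data)
{
    if (box->flag)
        return false;
    box->data   = data;
    box->sender = sender;
    box->flag   = true;
    return true;
}

/* Analogue of CoProcessorCheckForPost: test the flag without consuming mail. */
bool mailbox_check(const Mailbox *box) { return box->flag; }

/* Analogue of CoProcessorGetSendersAddress: does not empty the mailbox. */
uint32_t mailbox_sender(const Mailbox *box)
{
    assert(box->flag);      /* the hardware function may block here instead */
    return box->sender;
}

/* Analogue of CoProcessorDoPostDataRead: the only call that empties the mailbox. */
uint32_t mailbox_read(Mailbox *box)
{
    assert(box->flag);      /* the hardware function may block here instead */
    box->flag = false;
    return box->data;
}
```

Because only `mailbox_read` clears the flag, repeated calls to `mailbox_sender` return the same value until the mail is retrieved, as the remarks for CoProcessorGetSendersAddress describe.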

[5391] Client Mode Host Operations

[5392] A user function can perform host type operations. The host operating mode is enabled using the mail delivery system. Posting a message to address zero may allow a function to execute another function as if it were a host. The data may represent the index of the function that may receive communication. This functionality may also allow the remote execution of a function, provided that the platform supports this type of operation. The MSB of the data is used to distinguish between internal and remote function executions. If the MSB is set then remote execution mode is selected. Once this posting has been sent, the SendData and GetData functions may be re-directed to the specified function. To restore normal operation of the SendData and GetData functions, a message should be posted to address zero with the data set to zero.

[5393] Co-Processor User Functions

[5394] A co-processor function may be self contained within the Handel-C function construct. A function may interact with the system via the client user API. Every user function may accept the same parameter. The parameter may be a pointer to a structure that contains pointers to the user API functions. The only exception may be auxiliary I/O access. For a function to gain access to auxiliary I/O, the auxiliary I/O macros are used; these are the only part of the API that is publicly visible to a function.

[5395] FIG. 91 shows a typical execution flow 9100 for a function. Upon execution the function gathers its parameters in operation 9102, then performs a processing operation 9104, and returns the results to the host in operation 9106. FIG. 91 is only an example, since functions do not have to execute in this manner.

[5396] Host and Client Interaction Specification

[5397] The particular protocol used when a host and client communicate is not constrained. What is specified is the meaning of the messages that are communicated between a host and a client.

[5398] Basic Message Format

[5399] A host may always be the master in a communication; therefore a host may always initiate a data transfer between a host and a client. All messages from a host to a client may consist of an address with some data. The only message that a client can send to a host is an attention message; this message may carry no address or data. Host messages may be data read operations or data write operations. The host can use an address of zero and the address MSB to send auxiliary commands to a client (see 0).

[5400] Address Zero

[5401] Address zero is reserved for system use. Address zero is the only address that is reserved by the system and cannot be used as an index for a client user function. A host may use address zero to query the client when an attention message is received.

[5402] A host may send a read message with address zero to a client to retrieve the reason for the attention message sent by the client. The data read from the client may be an address; the MSB of the address is used as a modifier (see 0).

[5403] Address zero is used internally in a client to distinguish between function indexes and internal system requests.

[5404] A host can use address zero with the MSB modified (see 0) to represent an auxiliary command.

[5405] Address MSB

[5406] Typically the address MSB may be used as an address modifier bit. If a platform supports an alternative method of achieving the following, then the address MSB can be used for regular use. The address MSB is used to modify the meaning of the address.

[5407] A host uses the MSB modifier (set to ‘1’) in conjunction with address zero to distinguish between signal reason requests and auxiliary commands.

[5408] A host does not use the MSB modifier in normal communications, so the MSB should be set to ‘0’.

[5409] A client uses the MSB modifier internally to distinguish between internal (MSB set to ‘0’) addresses and external (MSB set to ‘1’) addresses.

[5410] Detailed Design

[5411] Host and Client Interaction

[5412] The communication protocol used to transfer between host and client is not constrained by this design. This design does constrain the meaning of the data transferred between host and client.

[5413] The actual method of data transfer used between a host and client may depend on the system platform.

[5414] The host may see the FPGA co-processor as an addressable device. An FPGA co-processor device may treat each available address as a data port. To interact with an FPGA co-processor device the host may read and write data to the available ports.

[5415] Basic requirements for host/client communication:

[5416] Address zero is reserved for system use.

[5417] An address may always be associated with data.

[5418] An address always refers to an existing function index or address zero.

[5419] The most significant 8 bits of the address are reserved by the system and are used as an address modifier.

[5420] Data Transfer Mechanism

[5421] To write data to or read data from a function, a host should perform a write or read operation to the address of that function. An address does not represent a register; data should be streamed to or from an address, i.e. by repeated reads or writes to the same address. Addresses should not be incremented when data is read or written, as this would address other functions.

[5422] Parameters are passed to a function by performing a write operation to the function's index address. Data is returned from a function by reading from the function's index address. The amount of data transferred is dependent on the design of the co-processor function. The host may know how much data to transfer, or may use the data being transferred to indicate how much data may be transferred.

[5423] Read and write operations can be interrupted and resumed by the host at any time. This is possible due to the slave nature of a client device. Remote functions on the co-processor may wait while the transfer is suspended.

[5424] Host to Client Addressing Mechanism

[5425] The address space of an FPGA co-processor is used to stream data to the functions residing on the co-processor. Each function on the co-processor may have a unique address assigned to it. The designation of the addresses is at the designer's discretion. The only address that cannot be used for a function identifier is address zero. Address zero is reserved for system use.

[5426] FIG. 92 shows a typical address packet 9200. The most significant 8 bits of an address are ignored by the client address decoders and used as a command byte instead. The most significant 8 bits of an address packet are used as an address modifier. All other bits are available for use as function indexes. See Table 3.

TABLE 3
Meaning                        Value of Address Modifier    Interpretation of Data
Address Zero Commands (used when writing to address zero)
Set FIFO trigger level         1                            Value for desired trigger level
Query function status          2                            Function address
Set interrupt timeout timer    3                            Time in μs
Address Zero Commands (used when reading from address zero)
Service required               1                            Function address
Function available             2                            Function address
Function busy                  3                            Function address
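As a rough illustration of the address packet layout described above, the following Python sketch splits a packet into its modifier byte and function index. It assumes a 32-bit packet; the text fixes only that the most significant 8 bits form the modifier. The names split_address, make_address, WRITE_COMMANDS and READ_STATUS are hypothetical and are not part of the described API.

```python
ADDRESS_BITS = 32                      # assumption: overall packet width
MODIFIER_BITS = 8                      # from the text: most significant 8 bits
INDEX_BITS = ADDRESS_BITS - MODIFIER_BITS
INDEX_MASK = (1 << INDEX_BITS) - 1

# Modifier values listed in Table 3 (meaning depends on read or write).
WRITE_COMMANDS = {1: "Set FIFO trigger level",
                  2: "Query function status",
                  3: "Set interrupt timeout timer"}
READ_STATUS = {1: "Service required",
               2: "Function available",
               3: "Function busy"}


def make_address(modifier, index):
    """Pack a modifier byte and a function index into one address packet."""
    if not 0 <= modifier < (1 << MODIFIER_BITS):
        raise ValueError("modifier does not fit in 8 bits")
    if not 0 <= index <= INDEX_MASK:
        raise ValueError("function index does not fit in the remaining bits")
    return (modifier << INDEX_BITS) | index


def split_address(packet):
    """Return (modifier, function index) for an address packet."""
    return (packet >> INDEX_BITS) & 0xFF, packet & INDEX_MASK
```

For a non-zero function address the modifier would be left at zero, so make_address(0, index) is simply the index itself.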

[5427] Currently the address modifier is not used when using an address value other than zero. When using an address value other than zero the address modifier should be set to zero.

[5428] Address Zero

[5429] Address zero is reserved for system use. This is the only address that cannot be used for a user function index.

[5430] Address Zero Functionality:

[5431] See Table 4.

TABLE 4
Address Number    Host Mode    Description of address usage
0                 Read         Read function status FIFO. The function status FIFO may contain messages from functions to the host. A message is a single double word; 32 bits of data. The messages from the client should be interpreted as an address packet using the status modifiers to interpret the address modifier data.
0                 Write        The host can use the address modifier data to send commands to the client.

[5432] Arbitration in a Multiple Host Environment

[5433] Client Message Signal/Interrupt

[5434] The client has only one mechanism to signal to the host. This may be in the form of an interrupt. The client may use this interrupt to pass two different messages to a host. When a host receives the interrupt it may query the client to determine the reason for the interrupt.

[5435] The host queries the client by reading from address zero. This may read the client message FIFO. The data sent by the client may be in the form of address packets with the client to host interpretation of the address modifier bit. The client may transmit zero valued address packets when there is no more data to read from the client message FIFO. Messages from the client may either be data ready messages from functions or function available messages from the client.
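The query sequence described above can be sketched in Python as a loop that reads address zero until a zero valued packet appears. This is a minimal sketch only: read_address_zero is a hypothetical callable standing in for whatever platform-specific read the host uses, and the 32-bit packet with an 8-bit modifier in the top byte is an assumption.

```python
def drain_message_fifo(read_address_zero):
    """Collect (modifier, function address) pairs until a zero packet is read.

    read_address_zero is a hypothetical callable that performs one read
    of address zero and returns the 32-bit word supplied by the client.
    """
    messages = []
    while True:
        packet = read_address_zero()
        if packet == 0:                # zero packet: the FIFO is empty
            break
        modifier = (packet >> 24) & 0xFF   # assumed modifier position
        function_address = packet & 0xFFFFFF
        messages.append((modifier, function_address))
    return messages
```

A host interrupt handler could call such a routine once per interrupt and then service each reported function in turn.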

[5436] Host API

[5437] The specification provides details about functional interfaces for the components of the host API.

[5438] Physical Component Library

[5439] The physical component library is platform and protocol dependent. A basic outline for a physical library implementation may be given here.

[5440] For a host to communicate with a client, the client must first be made to listen to the communication. This may be implemented by activating the co-processor's chip select line or by an alternative method. Once the co-processor is listening the host may transfer an address. Once a host has transmitted an address it may transfer data according to the host client interactions protocol.

[5441] Unique identifiers may be assigned to each function for progress monitoring.

[5442] API Public Interface Library

[5443] Deadline Scheduling of Communication Requests

[5444] It is possible to have several active data transfers at one time and it is possible to interrupt a transfer.

[5445] Client API

[5446] Physical Core and User Core Linkages

[5447] The co-processor construction libraries contain all the software needed to construct the framework of a co-processor. MPRAMs may be used to allow fast data transfer between the physical core and user core(s) when running in different clock domains.

[5448] An exemplary floating and fixed point library will now be set forth along with information on waveform analysis.

Fixed and Floating Point Library

[5449] The Handel-C Floating Point Library provides floating-point support to applications written with the Handel-C development environment.

[5450] Features of the Floating Point Library according to a preferred embodiment include the following:

[5451] Zero-cycle addition, multiplication and subtraction.

[5452] Contains useful operators such as negation, absolute values, shifts and rounding.

[5453] Supports numbers of up to exponent width 15 and mantissa width 63.

[5454] Supports conversion to and from integers.

[5455] Provides square root functionality.

[5456] The Floating Point Library can be used to provide the following applications:

[5457] Floating precision DSP's.

[5458] Vector matrix computation.

[5459] ‘Real World’ applications.

[5460] Any computation requiring precision.

[5461] In the Library, variables are kept in structures whose widths are defined at compile time. There are three parts to the structure: a single sign bit, exponent bits whose width is user defined upon declaration, and mantissa bits, also user defined. The ‘real’ value of the floating point number may be:

[5462] $(-1)^{\mathrm{sign}} \cdot 2^{(\mathrm{exponent} - \mathrm{bias})} \cdot (1.\mathrm{mantissa})$

[5463] Where the bias depends on the width of the exponent.

[5464] In use, floating point variable widths are set by using declaration macros at compile time.

[5465] Illustrative declaration macros are set forth below.

[5466] The library is used by calling one of the zero cycle macro expressions.

[5467] a = FloatAdd( b, c );

[5468] Multi-cycle macros are called in a different way.

[5469] FloatDiv(b, c, a);

[5470] The macros are not inherently shared; they are automatically expanded where they are called. If extensive use of some of the macros is required, it is advisable to share them in the following manner.

For zero-cycle macros:

    shared expr fmul_1(a, b) = FloatMult(a, b);
    shared expr fmul_2(a, b) = FloatMult(a, b);

For multi-cycle macros:

    void fdiv1(FLOAT_TYPE *d, FLOAT_TYPE *n, FLOAT_TYPE *q)
    {
        FloatDiv(*d, *n, *q);
    }

[5471] There will now be defined two zero-cycle multipliers and one divider. All the usual precautions on shared hardware may now be taken.

[5472] The following table, Table 5, provides performance statistics for various illustrative embodiments.

[5473] TABLE 5
Macro        Float Size (exp/mant)    CLB Slices    Max Clock Speed
Altera Flex 10K30A FPGA.
FloatAdd     6/16                     1205          9.46
FloatMult    6/16                     996           9.38
FloatDiv     6/16                     390           22.02
FloatSqrt    6/16                     361           18.21
FloatAdd     8/23                     1328          6.53
FloatMult    8/23                     1922          7.05
FloatDiv     8/23                     528           16.80
FloatSqrt    8/23                     505           13.47
Xilinx Virtex V1000-6 FPGA.
FloatAdd     6/16                     799           33.95
FloatMult    6/16                     445           30.67
FloatDiv     6/16                     348           39.61
FloatSqrt    6/16                     202           32.93
FloatAdd     8/23                     1113          33.95
FloatMult    8/23                     651           28.79
FloatDiv     8/23                     459           36.72
FloatSqrt    8/23                     273           38.31

[5474] The program files that make up this Library and their purpose are set forth below.

Filename     Purpose
Float.h      Prototypes the macros to the user
Float.lib    Stores the functionality of the library

[5475] Illustrative macros that may be defined in the Handel-C code are presented in the following table.

Macro Name         Type                Purpose
FLOAT              #define             Sets the widths of a floating-point variable
FloatAbs           Macro expression    Returns absolute value of a floating-point number
FloatNeg           Macro expression    Returns negation of a floating-point number
FloatLeftShift     Macro expression    Left shifts a floating-point number
FloatRightShift    Macro expression    Right shifts a floating-point number
FloatRound         Macro expression    Rounds the mantissa of a floating-point number
FloatConvert       Macro expression    Changes a floating-point number's width
FloatMult          Macro expression    Multiplies two floating-point numbers together
FloatAdd           Macro expression    Adds two floating-point numbers together
FloatSub           Macro expression    Subtracts two floating-point numbers from each other
FloatDiv           Macro procedure     Divides two floating-point numbers
FloatSqrt          Macro procedure     Finds the square root of a floating-point number
FloatToUInt        Macro expression    Converts a floating-point number to an unsigned integer
FloatToInt         Macro expression    Converts a floating-point number to a signed integer
FloatFromUInt      Macro expression    Converts an unsigned integer to a floating-point number
FloatFromInt       Macro expression    Converts a signed integer to a floating-point number

[5476] Software Development for the Floating-Point Library

[5477] This section specifies in detail the performance and functional specification of the design. It also documents tests that can be used to verify that each macro functions correctly and that they integrate to work as one complete library.

[5478] The purpose of this design is to update an existing library to enable the user to perform arithmetic operations and integer to floating point conversions on floating point numbers in Handel-C.

[5479] About the macros

[5480] Representation of a Floating Point Number.

[5481] A floating-point number is represented as a structure in the macros. The structure has three binary sections, in accordance with the IEEE 754 specifications.

[5482] Sign bit (unsigned int x.Sign)

[5483] Exponent (unsigned int x.Exponent)

[5484] Mantissa (unsigned int x.Mantissa)

[5485] In the library the structure of a floating-point number, say x, may be as follows:

[5486] x = {x.Sign, x.Exponent, x.Mantissa}

[5487] This represents the number:

[5488] $(-1)^{x.\mathrm{Sign}} \cdot (1.(x.\mathrm{Mantissa})) \cdot 2^{(x.\mathrm{Exponent} - \mathrm{bias})}$

[5489] This expression can represent any decimal number within a range restricted by the exponent and mantissa width. Below is an example of how a floating-point number is defined.

    #include <Float.h>
    set clock = external "P1";
    typedef FLOAT(4, 6) Float_4_6;
    void main()
    {
        Float_4_6 x;
        x = { 0, 9, 38 };
    }

[5490] First a structure type is chosen by stating the widths of the exponent and mantissa. The exponent is chosen to be of width 4 and the mantissa to be of width 6. This structure is named Float_4_6 and x is defined to be of this type.

[5491] x.Sign=0

[5492] This means that the number is positive.

[5493] x.Exponent=9

[5494] x.Exponent is unsigned but represents a signed number. To do this the exponent needs a correcting bias which is dependent on its width.

[5495] $\mathrm{Bias} = 2^{(\text{width of exponent} - 1)} - 1$

[5496] In this case, as the exponent width is 4, the bias is $2^{3} - 1 = 7$. The number 9 therefore means the multiplying factor is $2^{(9-7)} = 2^{2} = 4$.

[5497] x.Mantissa=38

[5498] The mantissa represents the fractional part of the number. As x.Mantissa = 38 = 100110 in binary, this represents the binary number 1.100110 in the equation. In decimal this is 1.59375. The one added to this number is known as a hidden 1.

[5499] The floating point number represented by {0,9,38} is:

[5500] $(-1)^{0} \cdot (1.59375) \cdot (4) = 6.375$
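The worked example above can be checked with a short Python sketch. It is an illustration only, not part of the library: the function names bias and float_value are hypothetical, exact fractions are used to avoid rounding, and the special bit patterns (NaN, infinity, zero) are deliberately ignored.

```python
from fractions import Fraction


def bias(exp_width):
    """Bias = 2^(width of exponent - 1) - 1, as given in the text."""
    return (1 << (exp_width - 1)) - 1


def float_value(sign, exponent, mantissa, exp_width, mant_width):
    """Value of {sign, exponent, mantissa}; special patterns are not handled."""
    significand = 1 + Fraction(mantissa, 1 << mant_width)   # hidden 1 + fraction
    scale = Fraction(2) ** (exponent - bias(exp_width))
    return (-1) ** sign * significand * scale
```

For the FLOAT(4, 6) value {0, 9, 38}, float_value(0, 9, 38, 4, 6) evaluates to 51/8, which is 6.375.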

[5501] IEEE Width Specifications.

[5502] The widths of the exponent and mantissa have certain set specifications.

[5503] IEEE 754 Single Precision

[5504] Exponent is 8 bits and has a bias of 127

[5505] Mantissa is 23 bits not including the hidden 1.

[5506] IEEE 754 Double Precision

[5507] Exponent is 11 bits and has a bias of 1023

[5508] Mantissa is 52 bits not including the hidden 1.

[5509] IEEE 754 Extended Precision

[5510] Exponent is 15 bits and has a bias of 16383

[5511] Mantissa is 64 bits not including the hidden 1.

[5512] The precision types can be requested by specifying these Exponent and Mantissa widths for the floating point number.

[5513] Valid Floating-Point Numbers.

[5514] For the purposes of this section a valid floating-point number is one of Exponent width less than 16 and Mantissa width less than 64. The Exponent and Mantissa are any bit pattern inside those widths, which includes the special bit patterns. This library is tested up to this level.

[5515] Single Cycle Expressions.

[5516] Most of the library utilities are zero cycle macro expressions and so use a single cycle when part of an assignment. They allow input variables of any width (up to a maximum mantissa width of 63). They may however only be tested up to a precision which is 1 sign bit, 15 exponent bits and 63 mantissa bits.

[5517] An example of a single cycle expression is the subtraction utility. This macro takes two floating-point numbers, f1 and f2, of the same structure type.

[5518] result = FloatSub(f1, f2);

[5519] Result would then be a floating-point number with the samestructure type as f1 and f2.

[5520] Division and Square Root Macros.

[5521] The only utilities implemented as macro procedures (which are not single cycle expressions) are the division and square-root macros. These are called in a slightly different manner, with one of the input parameters eventually holding the result value. For example, the division macro is defined as:

[5522] FloatDiv(N, D, Q);

[5523] The parameters for all these functions are:

[5524] N floating point numerator.

[5525] D floating point divisor.

[5526] Q floating point quotient (the result value).

[5527] N and D are unchanged after the macro is completed.

[5528] Special Values.

[5529] Special bit patterns are recognized in the library. These are referred to as Not a Number (NaN) and infinity.

[5530] NaN

[5531] NaN is represented by all 1's in the exponent and any non-zero pattern in the mantissa. Following is an example of a single precision NaN in binary.

[5532] x.Sign = 0

[5533] x.Exponent = 11111111

[5534] x.Mantissa = 00000000000000000000001

[5535] Infinity

[5536] Infinity is represented by all 1's in the exponent and all 0's in the mantissa. This is the only way the single precision infinity can be represented in binary.

[5537] x.Sign = 0

[5538] x.Exponent = 11111111

[5539] x.Mantissa = 00000000000000000000000
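The two special bit patterns can be told apart with a test on the exponent and mantissa fields, as in the following Python sketch. The function name classify and its string results are hypothetical, chosen only for illustration.

```python
def classify(exponent, mantissa, exp_width):
    """Return 'nan', 'infinity' or 'finite' for the given field values."""
    all_ones = (1 << exp_width) - 1    # exponent field with every bit set
    if exponent != all_ones:
        return "finite"
    # All 1's in the exponent: zero mantissa is infinity, anything else NaN.
    return "infinity" if mantissa == 0 else "nan"
```

With the single precision examples above, classify(0b11111111, 1, 8) gives 'nan' and classify(0b11111111, 0, 8) gives 'infinity'.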

[5540] Output when Errors Occur.

[5541] When an error occurs in the calculation a special bit pattern is output as an error message. The bit pattern that is produced depends on the situation. Several illustrative bit patterns are set forth below. Underflow is not strictly an error, but it is included below in Table 6 for reference.

TABLE 6
Problem number    Problem                   Where problem occurs    Output
1                 Input Infinity            Input                   Infinity
2                 Overflow                  Result                  Infinity
3                 x / 0, x != 0             Input                   Infinity
4                 Input NaN                 Input                   NaN (Mantissa : Same as input)
5                 0 * Infinity              Input                   NaN (Mantissa : 1)
6                 0 / 0                     Input                   NaN (Mantissa : 2)
7                 sqrt( x ), x < 0          Input                   NaN (Mantissa : 3)
8                 Infinity + (−Infinity)    Input                   NaN (Mantissa : 4)
9                 Infinity / Infinity       Input                   NaN (Mantissa : 5)
10                Underflow                 Result                  0
11                sqrt(−0)                  Input                   −0

[5542] For each of the following macros all input and result floating-point numbers have the same structure type.

[5543] Structure

[5544] ID : Structure 1

[5545] Prototype: #define FLOAT(ExpWidth, MantWidth) float_Name

[5546] Description.

[5547] Defines a structure called float_Name with an unsigned integer part called Sign (of width 1), an unsigned integer part called Exponent (of width ExpWidth) and an unsigned integer part called Mantissa (with width MantWidth). Note Table 7.

TABLE 7
Parameters    Description                  Range
ExpWidth      The width of the exponent    (1 . . . 15)
MantWidth     The width of the mantissa    (1 . . . 63)

[5548] Absolute Value.

[5549] ID : Function 1

[5550] Prototype : FloatAbs ( x )

[5551] Description.

[5552] Returns the absolute (positive) value of a floating point number.

[5553] Possible Error.

[5554] None. Note Table 8.

TABLE 8
Parameters    Description              Range
x             Floating-point Number    Any valid F.P. number

[5555] Negation.

[5556] ID : Function 2

[5557] Prototype : FloatNeg( x )

[5558] Description.

[5559] Returns the negated value of a floating point number.

[5560] Possible Error.

[5561] Negating zero returns a zero. Note Table 9.

TABLE 9
Parameters    Description              Range
x             Floating-point Number    Any valid F.P. number

[5562] Left Shift.

[5563] ID : Function 3

[5564] Prototype : FloatLeftShift(x,v)

[5565] Description.

[5566] Shifts a floating-point number by v places to the left. This macro is equivalent to << for integers.

[5567] Possible Error.

[5568] 1, 2 & 4.

[5569] Example.

[5570] Single precision representation of 6 left shifted by 4.

[5571] $(-1)^{0}(1 + 0.5) \cdot 2^{(129-127)} \ll 4 = (-1)^{0}(1 + 0.5) \cdot 2^{(133-127)}$

[5572] The result is the representation of 96, or $6 \cdot 2^{4}$. Note Table 10.

TABLE 10
Parameters    Description              Range
x             Floating-point Number    Any valid F.P. number
v             Amount to shift by.      Unsigned integer (0 . . . width(x))

[5573] Right Shift.

[5574] ID : Function 4

[5575] Prototype : FloatRightShift(x, v)

[5576] Description.

[5577] Shifts a floating-point number by v places to the right. This macro is equivalent to >> for integers.

[5578] Possible Error.

[5579] 1, 4 & 10. Note Table 11.

TABLE 11
Parameters    Description              Range
x             Floating-point Number    Any valid F.P. number
v             Amount to shift by.      Unsigned integer (0 . . . width(x))

[5580] Nearest Rounding.

[5581] ID : Function 5

[5582] Prototype : FloatRound( x, MantWidth)

[5583] Description.

[5584] Rounds a floating-point number to have mantissa width MantWidth. The value MantWidth should be less than the original mantissa width, or else the macro won't compile.

[5585] Possible Errors.

[5586] 1 & 4. Note Table 12.

TABLE 12
Parameters    Description                            Range
x             Floating-point number of any width     Any valid F.P. number
MantWidth     Mantissa width of the result           Unsigned integer (1 . . . 63)

[5587] Conversion Between Widths.

[5588] ID : Function 6

[5589] Prototype : FloatConvert(x, ExpWidth, MantWidth)

[5590] Description.

[5591] Converts a floating-point number to a float of exponent width ExpWidth and mantissa width MantWidth.

[5592] Possible Errors.

[5593] 1, 2 & 4. Note Table 13.

TABLE 13
Parameters    Description                            Range
x             Floating-point number of any width     Any valid F.P. number
ExpWidth      Exponent width of the result           Unsigned integer (1 . . . 15)
MantWidth     Mantissa width of the result           Unsigned integer (1 . . . 63)

[5594] Multiplier.

[5595] ID : Function 7

[5596] Prototype : FloatMult(x1, x2)

[5597] Description.

[5598] Multiplies two floating point numbers of matching widths.

[5599] Possible Errors.

[5600] 1, 2, 4, 5 & 10. Note Table 14.

TABLE 14
Parameters    Description               Range
x1, x2        Floating-point numbers    Any valid F.P. number

[5601] Addition.

[5602] ID : Function 8

[5603] Prototype : FloatAdd(x1, x2)

[5604] Description.

[5605] Adds two floating point numbers of matching widths.

[5606] Possible Errors.

[5607] 1, 2, 4 & 8. Note Table 15.

TABLE 15
Parameters    Description               Range
x1, x2        Floating-point numbers    Any valid F.P. number

[5608] Subtraction.

[5609] ID: Function 9

[5610] Prototype : FloatSub(x1, x2)

[5611] Description.

[5612] Subtracts two floating-point numbers of matching widths (x1−x2).

[5613] Possible Errors.

[5614] 1, 2, 4 & 8. Note Table 16.

TABLE 16
Parameters    Description               Range
x1, x2        Floating-point numbers    Any valid F.P. number

[5615] Division.

[5616] ID : Function 10

[5617] Prototype : FloatDiv(N, D, Q)

[5618] Description.

[5619] Divides two floating-point numbers of matching widths and outputs the quotient. N/D=Q

[5620] Possible Errors.

[5621] 1, 2, 3, 4, 6, 9 & 10. Note Table 17.

TABLE 17
Parameters    Description                             Range
N, D          Input floating-point numbers            Any valid F.P. number
Q             Output floating-point number = N / D    Any valid F.P. number

[5622] Square Root.

[5623] ID : Function 11

[5624] Prototype : FloatSqrt(R, Q)

[5625] Description.

[5626] Square roots a floating-point number. Sqrt(R) = Q

[5627] Possible Errors.

[5628] 1, 4, 7, 10 & 11. Note Table 18.

TABLE 18
Parameters    Description                               Range
R             Input floating-point number               Any valid F.P. number
Q             Output floating-point number = Sqrt(R)    Any valid F.P. number

[5629] Floating Point to Unsigned Integer Conversion.

[5630] ID : Function 12

[5631] Prototype : FloatToUInt(x, wi)

[5632] Description.

[5633] Converts a floating-point number into an unsigned integer of width wi using truncation rounding. If the number is negative a zero is returned.

[5634] Possible Errors.

[5635] 1 & 4. Note Table 19.

TABLE 19
Parameters    Description                  Range
x             Floating-point number        Any valid F.P. number
wi            Total width of the result    Any unsigned integer
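The conversion described above (truncation, with negative inputs giving zero) might be modelled in Python as follows. This is a sketch under stated assumptions: float_to_uint is a hypothetical name, the float is passed as a (sign, exponent, mantissa) tuple, results wider than wi bits are simply masked (the text does not say what happens in that case), and infinity and NaN inputs are not handled.

```python
def float_to_uint(x, exp_width, mant_width, wi):
    """Truncate a (sign, exponent, mantissa) float to an unsigned wi-bit integer."""
    sign, exponent, mantissa = x
    if sign == 1 or (exponent == 0 and mantissa == 0):
        return 0                              # negative numbers and zero give 0
    unbiased = exponent - ((1 << (exp_width - 1)) - 1)
    if unbiased < 0:
        return 0                              # magnitude below 1 truncates to 0
    significand = (1 << mant_width) | mantissa    # restore the hidden 1
    shift = unbiased - mant_width
    value = significand << shift if shift >= 0 else significand >> -shift
    return value & ((1 << wi) - 1)            # assumption: mask to wi bits
```

For the FLOAT(4, 6) value {0, 9, 38}, which represents 6.375, the sketch returns 6.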

[5636] Floating point to signed integer conversion.

[5637] ID : Function 13

[5638] Prototype : FloatToInt(x, wi)

[5639] Description.

[5640] Converts a floating point number into a signed integer of width wi using truncation rounding.

[5641] Possible Errors.

[5642] 1 & 4. Note Table 20.

TABLE 20
Parameters    Description                  Range
x             Floating-point number        Any valid F.P. number
wi            Total width of the result    Any signed integer

[5643] Unsigned Integer to Floating Point Conversion.

[5644] ID : Function 14

[5645] Prototype : FloatFromUInt(u, ExpWidth, MantWidth)

[5646] Description.

[5647] Converts an unsigned integer into a floating point number of exponent width ExpWidth and mantissa width MantWidth using truncation rounding.

[5648] Possible Errors.

[5649] 2. See Table 21.

TABLE 21
Parameters    Description                     Range
u             Unsigned integer                Any unsigned integer
ExpWidth      Exponent width of the result    Unsigned integer (1 . . . 15)
MantWidth     Mantissa width of the result    Unsigned integer (1 . . . 63)

[5650] Signed Integer to Floating Point Conversion.

[5651] ID: Function 15

[5652] Prototype : FloatFromInt(i, ExpWidth, MantWidth)

[5653] Description.

[5654] Converts a signed integer into a floating point number of exponent width ExpWidth and mantissa width MantWidth using truncation rounding.

[5655] Possible Errors.

[5656] 2. Note Table 22.

TABLE 22
Parameters    Description                     Range
i             Integer                         Any integer
ExpWidth      Exponent width of the result    Unsigned integer (1 . . . 15)
MantWidth     Mantissa width of the result    Unsigned integer (1 . . . 63)

[5657] Detailed Design

[5658] The following subsections describe design specifications for practicing various embodiments of the present invention.

[5659] Interface Design

[5660] Structure 1—FLOAT(ExpWidth, MantWidth) Float_Name

[5661] Description.

[5662] Defines a structure called Float_Name with an unsigned integer part called Sign (of width 1), an unsigned integer part called Exponent (of width ExpWidth) and an unsigned integer part called Mantissa (with width MantWidth).

[5663] Valid Floating-Point Numbers.

[5664] For the purposes of this section, a valid floating-point number is one of ExpWidth less than 16 and MantWidth less than 64. The Exponent and Mantissa are any bit pattern inside those widths including the special bit patterns. The library may be tested up to this level.

[5665] Input.

[5666] ExpWidth—The width of the exponent.

[5667] MantWidth—The width of the mantissa.

[5668] Output.

[5669] Format of the structure:

    struct
    {
        unsigned int 1 Sign;
        unsigned int ExpWidth Exponent;
        unsigned int MantWidth Mantissa;
    } float_Name;

[5670] Component Detail Design

[5671] Explanation of the Detailed Description.

[5672] If a variable isn't mentioned then it is the same on output as input. For ease of understanding, the operations on each component have each been provided with a header.

[5673] Each macro tests if the input is infinity or NaN before it does the stated calculations. If the input is invalid the same floating-point number is output. This can be done by testing whether the exponent is −1, i.e. all 1's when read as an unsigned value:

    if Exponent = −1
    {
        x = x
    }
    else
    {
        x = Calculation
    }

[5674] Some of the library macros call upon other macros unseen by the user. These are listed in each section, along with a brief description of their use, under the title “Dependencies”.

[5675] Function 1—FloatAbs(x)

[5676] Description.

[5677] Returns the absolute (positive) value of a floating point number.

[5678] Input.

[5679] x—Floating point number of width up to {1, 15, 63}.

[5680] Output.

[5681] Floating point number of same width as input.

[5682] Detailed description.

[5683] Sign

[5684] x.Sign = 0.

[5685] Function 2—FloatNeg(x)

[5686] Description.

[5687] Returns the negated value of a floating point number.

[5688] Input.

[5689] x—Floating point number of width up to {1, 15, 63}.

[5690] Output.

[5691] Floating point number of same width as input.

[5692] Detailed Description.

[5693] Sign

[5694] if Exponent@Mantissa = 0.

[5695] {

[5696] x.Sign = 0, Exponent = 0, Mantissa = 0

[5697] }

[5698] else

[5699] {

[5700] x.Sign = !Sign

[5701] }

[5702] Function 3—FloatLeftShift(x,v)

[5703] Description.

[5704] Shifts a floating-point number by v places to the left. This macro is equivalent to << for integers.

[5705] Input.

[5706] x—Floating point number of width up to {1, 15, 63}.

[5707] v—Unsigned integer to shift by. This is not larger than ExpWidth.

[5708] Output.

[5709] Floating point number of same width as input.

[5710] Detailed Description.

    if Exponent + v > The maximum exponent for the width
    {
        x = infinity
    }
    else
    {
        Exponent
        if x = 0
        {
            x = x
        }
        else
        {
            x.Exponent = Exponent + v
        }
    }
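The left shift logic above can be modelled in Python as below. This is a sketch rather than the library's implementation: float_left_shift is a hypothetical name, the float is a (sign, exponent, mantissa) tuple, and the maximum exponent for the width is assumed to be one less than the all 1's pattern reserved for infinity and NaN. The zero test is placed before the overflow test here so that zero can never become infinity.

```python
def float_left_shift(x, v, exp_width):
    """Multiply a (sign, exponent, mantissa) float by 2**v via its exponent."""
    sign, exponent, mantissa = x
    all_ones = (1 << exp_width) - 1
    if exponent == all_ones:                  # infinity or NaN passes through
        return x
    if exponent == 0 and mantissa == 0:       # zero stays zero
        return x
    if exponent + v >= all_ones:              # exceeds largest finite exponent
        return (sign, all_ones, 0)            # infinity
    return (sign, exponent + v, mantissa)
```

Applied to the single precision representation of 6, float_left_shift((0, 129, 1 << 22), 4, 8) moves the exponent from 129 to 133, which represents 96.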

[5711] Function 4—FloatRightShift(x, v)

[5712] Description.

[5713] Shifts a floating-point number by v places to the right. This macro is equivalent to >> for integers.

[5714] Input.

[5715] x—Floating point number of width up to {1, 15, 63}.

[5716] v—Unsigned integer to shift by. This is not larger than ExpWidth.

[5717] Output.

[5718] Floating point number of same width as input.

[5719] Detailed description.

    if Exponent − v < The minimum Exponent for the width
    {
        x = 0
    }
    else
    {
        Exponent
        if x = 0
        {
            x = x
        }
        else
        {
            x.Exponent = Exponent − v
        }
    }

[5720] Function 5—FloatRound(x, MantWidth)

[5721] Description.

[5722] Rounds a floating-point number to one with mantissa width MantWidth.

[5723] Input.

[5724] x—Floating point number of width up to {1, 15, 63}.

[5725] MantWidth—Round to unsigned mantissa width MantWidth.

[5726] Output.

[5727] Floating point number of same exponent width as input and mantissa width MantWidth.

[5728] Dependencies.

[5729] RoundUMant—extracts mantissa as an unsigned integer (with hidden1)

[5730] RoundRndMant—Rounds mantissa to MantWidth+2

[5731] Detailed description.

    Mantissa
    if the next least significant bit and any of the other less significant bits after the cut off point are 1
    {
        x.Mantissa = The MantWidth most significant bits of Mantissa + 1
    }
    else
    {
        x.Mantissa = The MantWidth most significant bits of Mantissa
    }
    Exponent
    if Mantissa overflows during rounding
    {
        x.Exponent = Exponent + 1
    }
    else
    {
        x.Exponent = Exponent
    }
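Read literally, the rounding rule above rounds up only when the first discarded bit is 1 and at least one lower discarded bit is also 1, so exact ties are rounded down. The Python sketch below follows that literal reading; it is not the IEEE round-to-nearest-even rule. The name float_round and the tuple representation are hypothetical, and special values and exponent overflow to infinity are not handled.

```python
def float_round(x, old_mant_width, new_mant_width):
    """Round the mantissa of (sign, exponent, mantissa) to new_mant_width bits."""
    sign, exponent, mantissa = x
    drop = old_mant_width - new_mant_width
    if drop <= 0:
        raise ValueError("new mantissa width must be less than the old width")
    kept = mantissa >> drop                        # most significant bits
    guard = (mantissa >> (drop - 1)) & 1           # first discarded bit
    sticky = mantissa & ((1 << (drop - 1)) - 1)    # remaining discarded bits
    if guard and sticky:
        kept += 1
    if kept >> new_mant_width:                     # 1.11...1 rounded up to 10.0
        kept = 0
        exponent += 1
    return (sign, exponent, kept)
```

For example, rounding the FLOAT(4, 6) value {0, 9, 38} (6.375) to a 3-bit mantissa gives {0, 9, 5}, which represents 6.5.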

[5732] Function 6—FloatConvert(x, ExpWidth, MantWidth)

[5733] Description.

[5734] Converts a floating-point number to a float of exponent width ExpWidth and mantissa width MantWidth.

[5735] Input.

[5736] x—Floating point number of width up to { 1, 15, 63}.

[5737] ExpWidth—Convert to unsigned exponent width ExpWidth.

[5738] MantWidth—Convert to unsigned mantissa width MantWidth.

[5739] Output.

[5740] Floating point number of exponent width ExpWidth and mantissa width MantWidth.

[5741] Detailed Description.
  if (Exponent − old bias) > new bias {
    x = infinity
  } else {
    Exponent:
      x.Exponent = Exponent − old bias + new bias
    Mantissa:
      if new width is greater than old width {
        x.Mantissa = extended mantissa
      } else {
        x.Mantissa = most significant width bits
      }
  }

[5742] Function 7—FloatMult(x1, x2)

[5743] Description.

[5744] Multiplies two floating point numbers.

[5745] Input.

[5746] x1, x2—Floating point numbers of width up to {1, 15, 63}

[5747] Output.

[5748] Floating point number of same width as input.

[5749] Dependencies.

[5750] MultUnderflowTest—Tests exponent for underflow.

[5751] MultOverflowTest—Tests exponent for overflow.

[5752] MultSign—Multiplies the signs.

[5753] GetDoubleMantissa—Pads the Mantissa with mantissa width zeros.

[5754] MantissaMultOverflow—Tests mantissa for overflow.

[5755] AddExponents—Adds exponents.

[5756] MultMantissa—Multiplies mantissa and selects the right bits.

[5757] Detailed Description.
  Test for exponent underflow
  if underflow is true {
    x = 0
  } else {
    Test for exponent overflow
    if overflow is true {
      x = Infinity
    } else {
      Sign:
        x.Sign = x1.Sign or x2.Sign
      Exponent:
        if mantissa overflows {
          x.Exponent = x1.Exponent + x2.Exponent + 1
        } else {
          x.Exponent = x1.Exponent + x2.Exponent
        }
      Mantissa:
        Both mantissas are padded below with zeros
        Mantissa = x1.Mantissa * x2.Mantissa
        x.Mantissa = top input width mantissa bits
    }
  }
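The core of the multiplication (sign, exponent and mantissa, without the underflow and overflow tests) might look like the C sketch below. The struct, the unbiased exponent and the function name are assumptions made for illustration. The sketch combines the signs with exclusive-or, which is an assumption about the intended behaviour of MultSign, since a product of two negative numbers is positive.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical unpacked float with an unbiased exponent; not the library's type. */
typedef struct {
    unsigned sign;      /* 0 or 1 */
    int      exponent;  /* unbiased */
    uint32_t mantissa;  /* w fraction bits, hidden 1 not stored */
} SketchFloat;

SketchFloat sketch_float_mult(SketchFloat a, SketchFloat b, unsigned w) {
    assert(w >= 1 && w <= 15);
    uint64_t ma = (1u << w) | a.mantissa;    /* restore the hidden 1 */
    uint64_t mb = (1u << w) | b.mantissa;
    uint64_t p  = ma * mb;                   /* up to 2w + 2 bits */

    SketchFloat r;
    r.sign     = a.sign ^ b.sign;            /* assumption: XOR of the signs */
    r.exponent = a.exponent + b.exponent;
    if (p >> (2 * w + 1)) {                  /* product >= 2.0: mantissa overflow */
        r.exponent += 1;
        p >>= 1;
    }
    r.mantissa = (uint32_t)((p >> w) & ((1u << w) - 1u));  /* top w fraction bits */
    return r;
}
```

With a 5-bit mantissa, 1.25 * 2^3 times -(1.0 * 2^1) yields sign 1, exponent 4 and mantissa 01000, that is -20.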

[5758] Function 8—FloatAdd(x1, x2)

[5759] Description.

[5760] Adds two floating point numbers.

[5761] Input.

[5762] x1, x2—Floating point numbers of width up to {1, 15, 63}.

[5763] Output.

[5764] Floating point number of same width as input.

[5765] Dependencies.

[5766] SignedMant—Extracts mantissa as a signed integer.

[5767] MaxBiasedExp—determines the greater of two biased exponents.

[5768] BiasedExpDiff—Gets the difference between two exponents (to 64).

[5769] AddMant—Adds two mantissas.

[5770] GetBiasedExp—Gets biased exponent of the result.

[5771] GetAddMant—Gets the normalised mantissa of the result.

[5772] Detailed Description.
  Test for overflow
  if number overflows {
    x = infinity
  } else {
    Sign:
      Adjust the mantissas to have the same exponent
      Add them
      x.Sign = sign of the result
    Exponent:
      if addition = 0 {
        x.Exponent = 0
      } else {
        x.Exponent = Max Exponent − amount mantissa adjusted by
      }
    Mantissa:
      Adjust the mantissas to have the same exponent
      Mantissa = x1.Mantissa + x2.Mantissa
      x.Mantissa = top width bits of mantissa
  }

[5773] Function 9—FloatSub(x1, x2)

[5774] Description.

[5775] Subtracts one float from another.

[5776] Input.

[5777] x1, x2—Floating point numbers of width up to {1, 15, 63}.

[5778] Output.

[5779] Floating point number (x1−x2) of same width as input.

[5780] Dependencies.

[5781] FloatNeg—Negates number.

[5782] FloatAdd—Adds two numbers.

[5783] Detailed Description.

[5784] x=FloatAdd(x1, −x2)

[5785] Function 10—FloatDiv(N, D, Q)

[5786] Description.

[5787] Divides two floats and outputs the quotient. Q=N/D.

[5788] Input.

[5789] N, D, Q—Floating point numbers of width up to { 1, 15, 63}

[5790] Output.

[5791] None as it is a macro procedure.

[5792] Detailed Description.

[5793] This division macro is based on the non-restoring basic division scheme for signed numbers. This scheme has the following routine:

[5794] Set s=2*(1 concatenated to N.Mantissa)

[5795] Set d=2*(1 concatenated to D.mantissa)

[5796] Check to see if s is at least as large as d

[5797] If so s=s/2 and set exponent adjust to one

[5798] Else set exponent adjust to zero

[5799] Then do the following procedure mantissa width +1 times.

[5800] Check to see if first digit of (2*s)−d is 0

[5801] If so s=(2*s)−d, q=(2*q)+1

[5802] Else s=2*s, q=2*q

[5803] The quotient Q is then

[5804] Q.Sign=N.Sign or D.Sign

[5805] Q.Exponent=N.Exponent−D.Exponent+the exponent adjust −1

[5806] Q.Mantissa=The least significant mantissa width bits of q

[5807] Worked Example—dividing 10 by −2.

[5808] 10=(1.25)*2^ 3={0, 0011, 01000}

[5809] −2=−(1.0)*2^ 1={1, 0001, 00000}

[5810] So

[5811] s=01010000

[5812] d=01000000

[5813] Is s larger than d? Yes so

[5814] s=00101000

[5815] adj_e=1

[5816] Iteration 1.

[5817] (2*s)−d=01010000−01000000=00010000

[5818] The first digit is 0 so

[5819] s=00010000

[5820] q=1

[5821] Iteration 2.

[5822] (2*s)−d=00100000−01000000=11100000

[5823] The first digit is 1 so

[5824] s=00100000

[5825] q=10

[5826] Iteration 3.

[5827] (2*s)−d=01000000−01000000=00000000

[5828] The first digit is 0 so

[5829] s=00000000

[5830] q=101

[5831] Iteration 4.

[5832] (2*s)−d=00000000−01000000=11000000

[5833] The first digit is 1 so

[5834] s=00000000

[5835] q=1010

[5836] Iteration 5.

[5837] (2*s)−d=00000000−01000000=11000000

[5838] The first digit is 1 so

[5839] s=00000000

[5840] q=10100

[5841] The result is that q ends up as 101000 after iteration 6 (mantissa width + 1 iterations).

[5842] The quotient Q is then:

[5843] Q.Sign=0 or 1=1

[5844] Q.Exponent=N.Exponent−D.Exponent+adj_e−1=3−1+1−1=2

[5845] Q.Mantissa=01000

[5846] So Q is −5 as required.
  if D = 0 {
    Sign = D Sign
    Exponent = −1
    Mantissa = 1
  } else {
    if N Exponent = −1 {
      Q = N
    } else {
      if D Exponent = −1 {
        Q = D
      } else {
        if N = 0 { s = 0 } else { s = ( 1 @ N Mantissa << 1 ) }
        d = ( 1 @ D Mantissa << 1 )
        q = 0
        i = 0
        if most significant bit (s − d) == 0 {
          s = s >> 1
          adj = 1
        } else {
          adj = 0
        }
        while i not equal to width of mantissa + 1 {
          if most significant bit of ( s << 1 ) − d = 0 {
            s = ( s << 1 ) − d
            q = ( q << 1 ) + 1
          } else {
            s = s << 1
            q = q << 1
          }
          i = i + 1
        }
        Q Sign = N Sign or D Sign
        if q = 0 {
          Q Exponent = 0
        } else {
          Q Exponent = N Exponent − D Exponent + adj + Bias − 1
        }
        Q Mantissa = bottom width bits of q
      }
    }
  }
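The mantissa loop of the division can be sketched in C as below. This is a minimal illustration that follows the worked example for 10 divided by −2. The function and struct names are hypothetical, signed 64-bit arithmetic replaces the test on the most significant bit, and the special cases (zero, infinity) are left out.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t q;    /* quotient bits after w + 1 iterations */
    int      adj;  /* exponent adjust */
} DivResult;

/* n_mant and d_mant are stored fraction bits (no hidden 1); w is the mantissa width. */
DivResult sketch_div_mantissa(uint32_t n_mant, uint32_t d_mant, unsigned w) {
    assert(w >= 1 && w <= 28);
    int64_t s = (int64_t)(((1u << w) | n_mant) << 1);  /* 2 * (1 @ N mantissa) */
    int64_t d = (int64_t)(((1u << w) | d_mant) << 1);  /* 2 * (1 @ D mantissa) */
    DivResult r = { 0, 0 };

    if (s - d >= 0) {          /* numerator mantissa not smaller than denominator */
        s >>= 1;
        r.adj = 1;
    }
    for (unsigned i = 0; i < w + 1; i++) {
        int64_t t = (s << 1) - d;
        if (t >= 0) {          /* "first digit is 0": subtraction succeeded */
            s = t;
            r.q = (r.q << 1) | 1u;
        } else {               /* keep the shifted remainder */
            s <<= 1;
            r.q <<= 1;
        }
    }
    return r;
}
```

For the worked example, `sketch_div_mantissa(8, 0, 5)` leaves q equal to binary 101000 with an exponent adjust of 1, so the low five bits give the mantissa 01000 and the exponent is 3 − 1 + 1 − 1 = 2.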

[5847] Function 11—FloatSqrt(R, Q)

[5848] Description.

[5849] Calculates the square root of the input. Q=Sqrt(R)

[5850] Input.

[5851] R, Q—Floating point numbers of width up to {1, 15, 63}.

[5852] Output.

[5853] None as it is a macro procedure.

[5854] Dependencies.

[5855] GetUnbiasedExp—Extracts unbiased exponent.

[5856] Detailed Description.

[5857] This square root macro is based on the restoring shift/subtract algorithm. This scheme has the following routine:

[5858] Set q=1

[5859] Set i=0

[5860] Check to see if exponent is even

[5861] If so

[5862] Set e=R.Exponent/2

[5863] Set s=R.Mantissa

[5864] Else

[5865] Set e=(R.Exponent−1)/2

[5866] Set s=2*R.Mantissa+2^ (mantissa width)

[5867] Then do the following procedure mantissa width + 1 times.

[5868] Check to see if first digit of (2*s)−(4*q+1)*2^ (Mantissa width−1−i) is 0

[5869] If so s=(2*s)−(4*q+1)*2^ (Mantissa width−1−i), q=(2*q)+1

[5870] Else s=2*s, q=2*q

[5871] The square root Q is then

[5872] Q.Sign=0

[5873] Q.Exponent=e+bias

[5874] Q.Mantissa= The least significant mantissa width bits of q

[5875] Worked Example—Square Rooting 36

[5876] 36=(1.125)*2^ 5={0, 0101, 00100}

[5877] So as exponent is odd

[5878] e=0010

[5879] s=2*mantissa+2^ 5=00001000+00100000=00101000

[5880] q=1

[5881] Iteration 1.

[5882] 01010000−(00000100+00000001)<<4=00000000

[5883] First digit is 0 so

[5884] s=00000000

[5885] q=11

[5886] Iteration 2.

[5887] 00000000−(00001100+00000001)<<3=10011000

[5888] First digit is 1 so

[5889] s=00000000

[5890] q=110

[5891] Iteration 3.

[5892] 00000000−(00011000+00000001)<<2=10011100

[5893] First digit is 1 so

[5894] s=00000000

[5895] q=1100

[5896] This continues until we have the answer

[5897] Q.Sign=0

[5898] Q.Exponent=2+ bias (in this case bias is 7)

[5899] Q.Mantissa=10000

[5900] So Q is the integer 6.
  if R Sign = 1 {
    Q Sign = R Sign
    Q Exponent = −1
    Q Mantissa = 2
  } else {
    if R Exponent = −1 {
      Q = R
    } else {
      if unbiased exponent even {
        e = ( Unbiased exponent ) / 2
        s = R Mantissa
      } else {
        e = ( Unbiased exponent − 1 ) / 2
        s = ( R Mantissa << 1 ) + e width of Q
      }
      q = 1
      i = 0
      while i not equal to width Mantissa + 1 {
        c = ( ( s << 1 ) − ( ( 4*q + 1 ) << width mantissa − 1 − i ) )
        if most significant bit of c = 0 {
          s = c
          q = ( q << 1 ) + 1
        } else {
          s = s << 1
          q = q << 1
        }
        i = i + 1
      }
      if R not equal to 0 {
        Q Sign = 0
        Q Exponent = e + bias
        Q Mantissa = top width bits of q
      } else {
        Q = 0
      }
    }
  }
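The shift/subtract loop can be sketched in C as follows, reproducing the worked example for 36. The function name is hypothetical. As an implementation assumption, the remainder carries one extra fraction bit so that the shift count stays non-negative in the final iteration; this scales both sides of the comparison by two and does not change which branch is taken.

```c
#include <assert.h>
#include <stdint.h>

/* s0: starting remainder from the routine (R.Mantissa for an even exponent,
   2*R.Mantissa + 2^w for an odd one); w: mantissa width.
   Returns q after w + 1 iterations: a leading 1 followed by w + 1 root bits. */
uint32_t sketch_sqrt_q(uint32_t s0, unsigned w) {
    assert(w >= 1 && w <= 28);
    int64_t s = (int64_t)s0 << 1;     /* extra fraction bit (assumption, see lead-in) */
    int64_t q = 1;                    /* integer part of the root is always 1 */
    for (unsigned i = 0; i <= w; i++) {
        int64_t c = (s << 1) - ((4 * q + 1) << (w - i));
        if (c >= 0) {                 /* "first digit is 0": accept the new root bit */
            s = c;
            q = (q << 1) + 1;
        } else {                      /* restore: keep the shifted remainder */
            s <<= 1;
            q <<= 1;
        }
    }
    return (uint32_t)q;
}
```

For 36 = 1.125 * 2^5 with a 5-bit mantissa, s0 is 2*4 + 32 = 40 and the function returns binary 1100000. Dropping the leading 1 and the last bit leaves 10000, matching Q.Mantissa in the example.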

[5901] Function 12—FloatToUInt(x, wi)

[5902] Description.

[5903] Converts a floating-point number into an unsigned integer of width wi using truncation rounding. If the number is negative a zero is returned.

[5904] Input.

[5905] x—Floating point number of width up to {1, 15, 63}

[5906] wi—unsigned width of unsigned integer

[5907] Output.

[5908] Unsigned integer of width wi.

[5909] Dependencies.

[5910] GetMant—Gets mantissa for conversion to integer

[5911] ToRoundInt—Rounds to nearest integer

[5912] MantissaToInt—Converts mantissa to integer

[5913] Detailed Description.
  if absolute value of float less than 0.5 or equal to 0 {
    Output 0
  } else {
    Left shift mantissa by exponent places
    Round to nearest integer
    Output (unsigned) integer
  }

[5914] Function 13—FloatToInt(x, wi)

[5915] Description.

[5916] Converts a floating point number into a signed integer of width wi using truncation rounding.

[5917] Input.

[5918] x—floating point number

[5919] wi—unsigned width of integer

[5920] Output.

[5921] Signed integer of width wi.

[5922] Dependencies.

[5923] GetMant—Gets mantissa for conversion to integer.

[5924] ToRoundInt—Rounds to nearest integer.

[5925] MantissaToInt—Converts mantissa to integer.

[5926] Detailed Description.
  if absolute value of float less than 0.5 or equal to 0 {
    Output 0
  } else {
    Left shift mantissa by exponent places
    Round to nearest integer
    if sign = 0 { Output integer } else { Output −integer }
  }

[5927] Function 14—FloatFromUInt(u, ExpWidth, MantWidth)

[5928] Description.

[5929] Converts an unsigned integer into a floating point number of exponent width ExpWidth and mantissa width MantWidth using truncation rounding.

[5930] Input.

[5931] u—unsigned integer

[5932] ExpWidth—unsigned width of output exponent

[5933] MantWidth—unsigned width of output mantissa

[5934] Output.

[5935] Floating point number of exponent width ExpWidth and mantissa width MantWidth.

[5936] Dependencies.

[5937] UIntToFloatExp—Gets signed integer to exponent

[5938] UIntToFloatNormalised—Gets signed integer to mantissa

[5939] Detailed Description.

[5940] When finding the left most bit of u the least significant bit is labelled 0 and the label numbering increases as the bits become more significant.
  Sign:
    Sign = most significant binary integer bit
  Exponent:
    if integer = 0 { Exponent = 0 } else { Exponent = position of left most bit + bias }
  Mantissa:
    if integer = 0 {
      Mantissa = 0
    } else {
      if width integer < width mantissa {
        Mantissa = integer << ( width mant − position of left most bit of u )
      } else {
        Mantissa = integer << ( width integer − position of left most bit of u )
      }
    }

[5941] Function 15—FloatFromInt(i, ExpWidth, MantWidth)

[5942] Description.

[5943] Converts a signed integer into a floating point number of exponent width ExpWidth and mantissa width MantWidth using truncation rounding.

[5944] Input.

[5945] i—signed integer.

[5946] ExpWidth—unsigned width of output exponent

[5947] MantWidth—unsigned width of output mantissa

[5948] Output.

[5949] Floating point number of exponent width ExpWidth and mantissa width MantWidth.

[5950] Dependencies.

[5951] IntToFloatExp—Gets unsigned integer to exponent

[5952] IntToFloatNormalised—Gets unsigned integer to mantissa

[5953] Detailed Description.

[5954] When finding the left most bit of i the least significant bit is labelled 0 and the label numbering increases as the bits become more significant.
  Sign:
    Sign = most significant integer bit
  Exponent:
    if integer = 0 { Exponent = 0 } else { Exponent = position of left most bit + bias }
  Mantissa:
    integer = absolute value of integer
    if integer = 0 {
      Mantissa = 0
    } else {
      if width integer < width mantissa {
        Mantissa = integer << ( width mant − left most bit of integer )
      } else {
        Mantissa = integer << ( width integer − left most bit of integer )
      }
    }

[5955] Verification

[5956] Testing can be implemented with verification methods such as Positive (Pos), Negative (Neg), Volume and Stress (Vol), Comparison (Comp) and Demonstration (Demo) tests.

[5957] Positive Testing

[5958] Valid floating point numbers are entered into the macro and the result is compared to the correct answer.

[5959] Negative Testing

[5960] Invalid floating point numbers are entered into the macro and the resultant error is compared to the correct error.

[5961] Volume and Stress Testing

[5962] Valid floating point numbers are repeatedly entered into the macro to see that it works in a correct and repeatable manner.

[5963] Comparison Testing

[5964] Correct results are obtained from a reliable source and the macro results are compared against them.

[5965] Demonstration Testing

[5966] Behavior in representative circumstances is evaluated.

[5967] Fixed Point Library

[5968] Software Development for the Fixed-Point Library

[5969] This section specifies in detail the performance and functional specification of the Fixed-Point Library design. It describes how requirements for implementation of the library are to be met. It also documents tests that are useful for verifying that each Handel-C and/or software unit functions correctly and that they integrate to work as one complete application.

[5970] The Handel-C Fixed-point Library contains a number of functions for the creation and manipulation of fixed-point numbers. It consists of a library (.lib) file, a header (.h) file and a function manual. The header prototypes the expressions available in the library.

[5971] The Handel-C Fixed-point Library is constrained to adopt the design philosophy of Handel-C where numerical operators require matching types. Therefore the parameters of each function are of matching width and type and the result returned may be of matching width and type unless otherwise specified.

[5972] Number Structure

[5973] FIXED_SIGNED(intWidth,fracWidth)

[5974] This creates a structure to hold a signed fixed-point number. intWidth sets the number of integer bits and fracWidth sets the number of fraction bits.

[5975] FIXED_UNSIGNED(intWidth,fracWidth)

[5976] This creates a structure to hold an unsigned fixed-point number. intWidth sets the number of integer bits and fracWidth sets the number of fraction bits.

[5977] FIXED_ISSIGNED

[5978] Defined to equal 1.

[5979] FIXED_ISUNSIGNED

[5980] Defined to equal 0.

[5981] Bit Manipulation Operators

[5982] FixedLeftShift(fixed_Name, variable_Shift)

[5983] Returns fixed_Name shifted left by variable_Shift number of bits. This produces a fixed-point number of the same type and width as fixed_Name with any bits shifted outside of its width being lost and any bits added being zero.

[5984] FixedRightShift(fixed_Name, variable_Shift)

[5985] Returns fixed_Name shifted right by variable_Shift number of bits. This produces a fixed-point number of the same type as fixed_Name with any bits shifted outside of its width being lost. When shifting unsigned values the upper bits are padded with zeros. When shifting signed values, the upper bits are copies of the top bit of the original value. So signed numbers are sign extended in the same way as the Handel-C shift right function.

[5986] Arithmetic Operators

[5987] Any attempt to perform one of these operations on two expressions of differing widths or types may result in a compiler error.

[5988] FixedNeg(fixed_Name)

[5989] Returns the negative of the operand.

[5990] FixedAdd(fixed_Name1, fixed_Name2)

[5991] Returns the sum of the operands.

[5992] FixedSub(fixed_Name1, fixed_Name2)

[5993] Returns fixed_Name2 subtracted from fixed_Name1.

[5994] FixedMultSigned(fixed_Name1, fixed_Name2)

[5995] Returns the product of the operands for signed numbers only.

[5996] FixedMultUnsigned(fixed_Name1, fixed_Name2)

[5997] Returns the product of the operands for unsigned numbers only.

[5998] FixedDivSigned(fixed_Name1, fixed_Name2)

[5999] Returns fixed_Name1 divided by fixed_Name2 for signed numbers only.

[6000] FixedDivUnsigned(fixed_Name1, fixed_Name2)

[6001] Returns fixed_Name1 divided by fixed_Name2 for unsigned numbers only.

[6002] FixedAbs(fixed_Name)

[6003] Returns the absolute value.

[6004] Relational Operators

[6005] These operators compare values of the same width and return a single bit wide unsigned int value of 0 for false or 1 for true.

[6006] FixedEq(fixed_Name1, fixed_Name2)

[6007] Returns true if the operands are equal.

[6008] FixedNEq(fixed_Name1, fixed_Name2)

[6009] Returns true if the operands are not equal.

[6010] FixedLT(fixed_Name1, fixed_Name2)

[6011] Returns true if fixed_Name1 is less than fixed_Name2.

[6012] FixedLTE(fixed_Name1, fixed_Name2)

[6013] Returns true if fixed_Name1 is less than or equal to fixed_Name2.

[6014] FixedGT(fixed_Name1, fixed_Name2)

[6015] Returns true if fixed_Name1 is greater than fixed_Name2.

[6016] FixedGTE(fixed_Name1, fixed_Name2)

[6017] Returns true if fixed_Name1 is greater than or equal to fixed_Name2.

[6018] Bitwise Logical Operators

[6019] These operators perform bitwise logical operations on fixed-point numbers. Both operands may be of the same type and width: the resulting value may also be this type and width.

[6020] FixedNot(fixed_Name)

[6021] Returns bitwise not.

[6022] FixedAnd(fixed_Name1, fixed_Name2)

[6023] Returns bitwise and.

[6024] FixedOr(fixed_Name1, fixed_Name2)

[6025] Returns bitwise or.

[6026] FixedXor(fixed_Name1, fixed_Name2)

[6027] Returns bitwise exclusive or.

[6028] Conversion Operators

[6029] These operators are for the type conversion of fixed numbers.

[6030] FixedIntWidth(fixed_Name)

[6031] Returns the width of the integer part of fixed_Name as a compile time constant.

[6032] FixedFracWidth(fixed_Name)

[6033] Returns the width of the fraction part of fixed_Name as a compile time constant.

[6034] FixedLiteral(isSigned, intWidth, fracWidth, intBits, fracBits)

[6035] Returns a signed fixed-point number if isSigned is true and an unsigned fixed-point number if isSigned is false. This number has an integer part intBits of width intWidth and a fraction part fracBits of width fracWidth.

[6036] FixedToInt(fixed_Name)

[6037] Returns the integer part of the fixed-point number with the same type and width.

[6038] FixedToBool(fixed_Name)

[6039] Returns a single bit wide unsigned int value which is 0 for false if the operand equals 0 and 1 for true otherwise.

[6040] FixedToBits(fixed_Name)

[6041] Returns the integer and fraction bits of fixed_Name concatenated together. For a signed fixed-point number this may produce a signed integer of width intWidth + fracWidth.

[6042] For an unsigned fixed-point number this may produce an unsigned integer of width intWidth + fracWidth.

[6043] FixedCastSigned(isSigned, intWidth, fracWidth, fixed_Name)

[6044] Casts any signed fixed-point number to the type and width specified.

[6045] FixedCastUnsigned(isSigned, intWidth, fracWidth, fixed_Name)

[6046] Casts any unsigned fixed-point number to the type and width specified.

[6047] Design

[6048] This section may describe the present invention according to a preferred embodiment.

[6049] Interface

[6050] This library can be accessed via a standard header file included in the client's programs by "#include <fixed.h>".

[6051] Shared Resources

[6052] Although the internal macros may be used by more than one public macro, there can be no sharing conflicts: they are not defined as shared expressions, so Handel-C may generate all the hardware required for every expression in the library every time it is used.

[6053] Note:

[6054] Handel-C arithmetic is used throughout the macros. This means that all operators return results of the same width as their operands and all overflow bits are dropped. For example:
  #include "fixed.h"
  set clock = external "P1";
  typedef FIXED_UNSIGNED(4,4) MyFixed;
  void main(void)
  {
    MyFixed fixed1, fixed2, fixed3;
    // Assign the value 5 to fixed1
    fixed1 = FixedLiteral(FIXED_ISUNSIGNED, 4, 4, 5, 0);
    // Assign the value 5.5 to fixed2
    fixed2 = FixedLiteral(FIXED_ISUNSIGNED, 4, 4, 5, 8);
    // Multiply the numbers together
    fixed3 = FixedMultUnsigned(fixed1, fixed2);
  }

[6055] The exact product, 27.5, does not fit in four integer bits, so the overflow bit is dropped and this example results in fixed3 being set to 11.5:

[6056] fixed3.FixedIntBits=11;

[6057] fixed3.FixedFracBits=8;

[6058] The user is responsible for handling overflows explicitly and can use FixedCastSigned and FixedCastUnsigned to change the width of a fixed-point number.
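The dropped overflow in the example can be reproduced with ordinary C integers. The sketch below models an unsigned 4.4 fixed-point value as a struct of two 4-bit fields; the type and function names are hypothetical stand-ins for illustration, not the library's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for FIXED_UNSIGNED(4,4). */
typedef struct {
    uint8_t int_bits;   /* 4 integer bits */
    uint8_t frac_bits;  /* 4 fraction bits */
} UFixed4_4;

UFixed4_4 sketch_mult_unsigned(UFixed4_4 a, UFixed4_4 b) {
    assert(a.int_bits < 16 && a.frac_bits < 16);
    assert(b.int_bits < 16 && b.frac_bits < 16);
    uint32_t x = ((uint32_t)a.int_bits << 4) | a.frac_bits;  /* concatenate parts */
    uint32_t y = ((uint32_t)b.int_bits << 4) | b.frac_bits;
    uint32_t p = (x * y) >> 4;        /* drop the fracWidth least significant bits */
    UFixed4_4 r;
    r.int_bits  = (uint8_t)((p >> 4) & 0xFu);   /* bits above 4.4 are lost */
    r.frac_bits = (uint8_t)(p & 0xFu);
    return r;
}
```

Multiplying {5, 0} by {5, 8} gives {11, 8}: the bit string for 27.5 is 11011.1000, and losing the top bit leaves 1011.1000, which is 11.5.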

[6059] Number Structure

[6060] The Handel-C data types used are signed and unsigned fixed-point numbers of user defined widths. The structures of the signed and unsigned fixed-point numbers are below. The widths of these fixed-point numbers are declared by the user. All the operations necessary to set, manipulate and extract the values of fixed_Name.FixedIntBits and fixed_Name.FixedFracBits are available in the library.

[6061] COMP 1.1 FIXED_SIGNED(intWidth, fracWidth)

[6062] Description

[6063] This creates a structure to hold a signed fixed-point number. intWidth sets the number of integer bits and fracWidth sets the number of fraction bits.
Inputs:
  intWidth   width of the integer part of the number
  fracWidth  width of the fraction part of the number

[6064] COMP 1.2 FIXED_UNSIGNED(intWidth, fracWidth)

[6065] Description

[6066] This creates a structure to hold an unsigned fixed-point number. intWidth sets the number of integer bits and fracWidth sets the number of fraction bits.
Inputs:
  intWidth   width of the integer part of the number
  fracWidth  width of the fraction part of the number

[6067] Output

[6068] Format of the structure:

[6069] struct

[6070] {

[6071] unsigned intWidth FixedIntBits;

[6072] unsigned fracWidth FixedFracBits;

[6073] }

[6074] COMP 1.3 FIXED_ISSIGNED

[6075] Description

[6076] Defined to equal 1.

[6077] Inputs

[6078] None

[6079] Output

[6080] None

[6081] COMP 1.4 FIXED_ISUNSIGNED

[6082] Description

[6083] Defined to equal 0.

[6084] Inputs

[6085] None

[6086] Output

[6087] None

[6088] Bit Manipulation Operators

[6089] COMP 2.1 FixedLeftShift (fixed_Name, variable_Shift)

[6090] Description

[6091] Returns fixed_Name shifted left by variable_Shift number of bits. This produces a fixed-point number of the same type and width as fixed_Name with any bits shifted outside of its width being lost and lower bits padded with zeros.
Inputs:
  fixed_Name      Fixed-point number of any type or width
  variable_Shift  Unsigned integer number of bits to shift by. Width set by: width(variable_Shift) = log2ceil(fracWidth + intWidth + 1)

[6092] Output

[6093] Fixed-point number of same type and width as fixed_Name

[6094] Detailed Description

[6095] Concatenate integer and fraction parts from fixed_Name into asingle bit string

[6096] Shift the bit string left by variable_Shift number of bits

[6097] Split result into integer and fraction parts of same type and width as fixed_Name

[6098] Return as struct

[6099] COMP 2.2 FixedRightShift(fixed_Name, variable_Shift)

[6100] Description

[6101] Returns fixed_Name shifted right by variable_Shift number of bits. This produces a fixed-point number of the same type as fixed_Name with any bits shifted outside of its width being lost. When shifting unsigned values the upper bits are padded with zeros. When shifting signed values, the upper bits are copies of the top bit of the original value. So signed numbers are sign extended in the same way as the Handel-C shift right function.
Inputs:
  fixed_Name      Fixed-point number of any type or width
  variable_Shift  Unsigned integer number of bits to shift by. Width set by: width(variable_Shift) = log2ceil(fracWidth + intWidth + 1)

[6102] Output

[6103] Fixed-point number of same type and width as fixed_Name

[6104] Detailed Description

[6105] Concatenate integer and fraction parts from fixed_Name into a single bit string

[6106] Shift the bit string right by variable_Shift number of bits

[6107] Split result into integer and fraction parts of same type and width as fixed_Name

[6108] Return as struct
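The two shift macros differ only in what enters at the vacated end of the bit string. A minimal C sketch on the concatenated 8-bit string of a hypothetical 4.4 format follows; the function names and the fixed total width are assumptions. Sign extension is written out explicitly because right-shifting a negative signed value is implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

enum { TOTAL_W = 8 };  /* intWidth + fracWidth for a hypothetical 4.4 format */

/* Left shift: bits leaving the top are lost, zeros enter at the bottom. */
uint8_t sketch_fixed_shl(uint8_t bits, unsigned n) {
    assert(n <= TOTAL_W);
    return n == TOTAL_W ? 0 : (uint8_t)(bits << n);
}

/* Right shift: zeros enter at the top for unsigned values,
   copies of the original top bit for signed values. */
uint8_t sketch_fixed_shr(uint8_t bits, unsigned n, int is_signed) {
    assert(n <= TOTAL_W);
    uint8_t fill = (is_signed && (bits & 0x80u)) ? 0xFFu : 0x00u;
    if (n == TOTAL_W) return fill;
    return (uint8_t)((bits >> n) | (uint8_t)(fill << (TOTAL_W - n)));
}
```

For instance, the signed value −2.5 is the bit string 1101.1000; shifting it right by one with sign extension gives 1110.1100, which is −1.25, while the unsigned shift gives 0110.1100.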

[6109] Arithmetic Operators

[6110] COMP 3.1 FixedNeg(fixed_Name)

[6111] Description

[6112] Returns the negative of fixed_Name. The result of using this macro on an unsigned fixed-point structure is undefined.
Inputs:
  fixed_Name  Fixed-point number of any type and width

[6113] Output

[6114] Fixed-point number of same type and width as fixed_Name

[6115] Detailed Description

[6116] Concatenate integer and fraction parts from fixed_Name into a single bit string

[6117] Negate the bit string

[6118] Split result into integer and fraction parts of same type and width as fixed_Name

[6119] Return as struct

[6120] COMP 3.2 FixedAdd(fixed_Name1, fixed_Name2)

[6121] Description

[6122] Returns fixed_Name1 and fixed_Name2 added together. The number returned is of the same width as the operands so any bits produced by the addition outside of this width overflow and are dropped.
Inputs:
  fixed_Name1  Fixed-point number of any type or width
  fixed_Name2  Fixed-point number of the same type and width

[6123] Output

[6124] Fixed-point number of same type and width as fixed_Name1

[6125] Detailed Description

[6126] At compile time check the operands are of the same width and if not give an assertion error

[6127] Concatenate integer and fraction parts from fixed_Name1 into a single bit string

[6128] Concatenate integer and fraction parts from fixed_Name2 into a single bit string

[6129] Add the bit strings and drop any overflow bits

[6130] Split result into integer and fraction parts of same type and width as fixed_Name1

[6131] Return as struct
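The concatenate, add, split sequence above can be sketched in C for an unsigned 4.4 format. The struct and function names are hypothetical; the compile-time width check of the macro has no counterpart here because the widths are fixed by the struct.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for FIXED_UNSIGNED(4,4). */
typedef struct {
    uint8_t int_bits;   /* 4 integer bits */
    uint8_t frac_bits;  /* 4 fraction bits */
} UFixed4_4;

UFixed4_4 sketch_fixed_add(UFixed4_4 a, UFixed4_4 b) {
    assert(a.int_bits < 16 && a.frac_bits < 16);
    assert(b.int_bits < 16 && b.frac_bits < 16);
    unsigned x = ((unsigned)a.int_bits << 4) | a.frac_bits;  /* concatenate */
    unsigned y = ((unsigned)b.int_bits << 4) | b.frac_bits;
    unsigned sum = (x + y) & 0xFFu;                          /* drop overflow bits */
    UFixed4_4 r;
    r.int_bits  = (uint8_t)(sum >> 4);                       /* split */
    r.frac_bits = (uint8_t)(sum & 0xFu);
    return r;
}
```

Adding 12.5 and 5.75 in this format yields 2.25 rather than 18.25, because the carry out of the integer part is dropped.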

[6132] COMP 3.3 FixedSub(fixed_Name1, fixed_Name2)

[6133] Description

[6134] Returns fixed_Name2 subtracted from fixed_Name1. The number returned is of the same width as the operands so any bits produced by the subtraction outside of this width overflow and are lost.
Inputs:
  fixed_Name1  Fixed-point number of any type or width
  fixed_Name2  Fixed-point number of the same type and width

[6135] Output

[6136] Fixed-point number of same type and width as fixed_Name1

[6137] Detailed Description

[6138] At compile time check the operands are of the same width and if not give an assertion error

[6139] Concatenate integer and fraction parts from fixed_Name1 into a single bit string

[6140] Concatenate integer and fraction parts from fixed_Name2 into a single bit string

[6141] Subtract the bit strings and drop any overflow bits

[6142] Split result into integer and fraction parts of same type and width as fixed_Name1

[6143] Return as struct

[6144] COMP 3.4 FixedMultSigned(fixed_Name1, fixed_Name2)

[6145] Description

[6146] Returns the product of the operands for signed numbers only. The number returned is of the same width as the operands so any bits produced by the multiplication outside of this width overflow and are lost.
Inputs:
  fixed_Name1  Signed fixed-point number of any width
  fixed_Name2  Signed fixed-point number of the same width

[6147] Output

[6148] Signed fixed-point number of same width as fixed_Name1

[6149] Detailed Description

[6150] At compile time check fixed_Name1 and fixed_Name2 are of the same width and signed type

[6151] Concatenate integer and fraction parts from fixed_Name1 into a single bit string and sign extend the string by the width of the fraction part of fixed_Name1

[6152] Concatenate integer and fraction parts from fixed_Name2 into a single bit string and sign extend the string by the width of the fraction part of fixed_Name2

[6153] Multiply these bit strings together

[6154] Drop the fracWidth least significant bits of the result, then split the result into integer and fraction parts of the same type and width as fixed_Name1

[6155] Return as struct

[6156] COMP 3.5 FixedMultUnsigned(fixed_Name1, fixed_Name2)

[6157] Description

[6158] Returns the product of the operands for unsigned numbers only. The number returned is of the same width as the operands so any bits produced by the multiplication outside of this width overflow and are lost.
Inputs:
  fixed_Name1  Unsigned fixed-point number of any width
  fixed_Name2  Unsigned fixed-point number of the same width

[6159] Output

[6160] Unsigned fixed-point number of same width as fixed_Name1

[6161] Detailed Description

[6162] At compile time check fixed_Name1 and fixed_Name2 are of same width and unsigned type

[6163] Concatenate integer and fraction parts from fixed_Name1 into a single bit string and extend the string with zeros by the width of the fraction part of fixed_Name1

[6164] Concatenate integer and fraction parts from fixed_Name2 into a single bit string and extend the string with zeros by the width of the fraction part of fixed_Name2

[6165] Multiply these bit strings together

[6166] Drop the fracWidth least significant bits of the result

[6167] Split the result into integer and fraction parts of the same type and width as fixed_Name1

[6168] Return as struct

[6169] COMP 3.6 FixedDivSigned(fixed_Name1, fixed_Name2)

[6170] Description

[6171] Returns fixed_Name1 divided by fixed_Name2 for signed numbers only. The result for fixed_Name2=0 is undefined. The number returned is of the same width as the operands so any bits produced by the division outside of this width are lost.
Inputs:
  fixed_Name1  Signed fixed-point number of any width
  fixed_Name2  Signed fixed-point number of the same width not equal to zero

[6172] Output

[6173] Signed fixed-point number of same width as fixed_Name1

[6174] Detailed Description

[6175] At compile time check fixed_Name1 and fixed_Name2 are of the same width and signed type

[6176] Concatenate together the integer and fraction parts of fixed_Name1, and zero with the same width as the fraction part, into a single bit string

[6177] Concatenate integer and fraction parts from fixed_Name2 into a single bit string and sign extend the string by the width of the fraction part of fixed_Name2

[6178] Divide the first bit string by the second

[6179] Take the least significant bits of the result to make it the same length as the divided bit string

[6180] Split result of func into integer and fraction parts of same type and width as fixed_Name1

[6181] Return as struct

[6182] COMP 3.7 FixedDivUnsigned(fixed_Name1, fixed_Name2)

[6183] Description

[6184] Returns fixed_Name1 divided by fixed_Name2 for unsigned numbers only. The result for fixed_Name2=0 is undefined. The number returned is of the same width as the operands so any bits produced by the division outside of this width are lost.
Inputs:
  fixed_Name1  Unsigned fixed-point number of any width
  fixed_Name2  Unsigned fixed-point number of the same width.

[6185] Output

[6186] Unsigned fixed-point number of same type and width as fixed_Name1

[6187] Detailed Description

[6188] At compile time check fixed_Name1 and fixed_Name2 are of the same width and unsigned type

[6189] Concatenate together the integer and fraction parts of fixed_Name1, and zero with the same width as the fraction part, into a single bit string

[6190] Concatenate integer and fraction parts from fixed_Name2 into a single bit string and extend the string by the width of the fraction part of fixed_Name2

[6191] Divide the first bit string by the second

[6192] Take the least significant bits of the result to make it the same length as the divided bit string

[6193] Split result of func into integer and fraction parts of same type and width as fixed_Name1

[6194] Return as struct
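Appending fracWidth zero bits to the dividend before dividing is what keeps the fraction bits of the quotient. A minimal C sketch of the unsigned case for a hypothetical 4.4 format follows; the names are illustrative and, since the result for a zero divisor is undefined in the text, the sketch simply asserts that the divisor is non-zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for FIXED_UNSIGNED(4,4). */
typedef struct {
    uint8_t int_bits;   /* 4 integer bits */
    uint8_t frac_bits;  /* 4 fraction bits */
} UFixed4_4;

UFixed4_4 sketch_div_unsigned(UFixed4_4 a, UFixed4_4 b) {
    unsigned x = ((unsigned)a.int_bits << 4) | a.frac_bits;  /* dividend bits */
    unsigned y = ((unsigned)b.int_bits << 4) | b.frac_bits;  /* divisor bits */
    assert(y != 0);                      /* division by zero is undefined */
    unsigned q = ((x << 4) / y) & 0xFFu; /* pad with fracWidth zeros, divide, keep low bits */
    UFixed4_4 r;
    r.int_bits  = (uint8_t)(q >> 4);
    r.frac_bits = (uint8_t)(q & 0xFu);
    return r;
}
```

Dividing 5.5 by 2 this way gives the bit string 0010.1100, which is 2.75.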

[6195] COMP 3.8 FixedAbs(fixed_Name)

[6196] Description

[6197] Returns the absolute value. The result of using this macro on an unsigned fixed-point structure is undefined. Signed integers use 2's complement representation in Handel-C so

[6198] abs(max positive number)<abs(min negative number)

[6199] This means the function gives the result:

[6200] abs(min negative number)=min negative number.
Inputs:
  fixed_Name  Fixed-point number of any type and width

[6201] Output

[6202] Fixed-point number of same type and width as fixed_Name

[6203] Detailed Description

[6204] Concatenate integer and fraction parts from fixed_Name into a single bit string

[6205] Find the absolute value of the bit string

[6206] Split result into integer and fraction parts of same type and width as fixed_Name

[6207] Return as struct
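A short Python sketch may make the minimum-negative corner case concrete. It assumes the parts are passed as raw two's complement bit patterns; `fixed_abs` and its argument order are illustrative names only.

```python
def fixed_abs(int_bits, frac_bits, int_width, frac_width):
    """Absolute value of a signed fixed-point number given as bit patterns."""
    width = int_width + frac_width
    mask = (1 << width) - 1
    # Concatenate the integer and fraction parts into one bit string.
    raw = ((int_bits << frac_width) | frac_bits) & mask
    if raw >> (width - 1):       # sign bit set: the value is negative
        raw = (-raw) & mask      # two's complement negate, wrapped to width
    # Split the result back into integer and fraction parts.
    return raw >> frac_width, raw & ((1 << frac_width) - 1)
```

For a 4.4 format, -1.5 (integer bits 1110, fraction bits 1000) becomes 1.5, whereas the minimum negative number (1000 0000) negates to itself, matching the result stated above.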

[6208] Relational Operators

[6209] The macros in this section rely on Handel-C's type and width checking.

[6210] COMP 4.1 FixedEq(fixed_Name1, fixed_Name2)

[6211] Description

[6212] Returns true if the operands are equal.

Inputs:
fixed_Name1: Fixed-point number of any type or width
fixed_Name2: Fixed-point number of the same type and width

[6213] Output

[6214] Single bit wide unsigned integer value with 0 as false and 1 as true

[6215] Detailed Description

[6216] Concatenate integer and fraction parts from fixed_Name1 into a single bit string

[6217] Concatenate integer and fraction parts from fixed_Name2 into a single bit string

[6218] True if the bit strings are equal

[6219] Return result

[6220] COMP 4.2 FixedNEq(fixed_Name1, fixed_Name2)

[6221] Description

[6222] Returns true if the operands are not equal.

Inputs:
fixed_Name1: Fixed-point number of any type or width
fixed_Name2: Fixed-point number of the same type and width

[6223] Output

[6224] Single bit wide unsigned integer value with 0 as false and 1 as true

[6225] Detailed Description

[6226] Concatenate integer and fraction parts from fixed_Name1 into a single bit string

[6227] Concatenate integer and fraction parts from fixed_Name2 into a single bit string

[6228] True if the bit strings are not equal

[6229] Return result

[6230] COMP 4.3 FixedLT(fixed_Name1, fixed_Name2)

[6231] Description

[6232] Returns true if fixed_Name1 is less than fixed_Name2.

Inputs:
fixed_Name1: Fixed-point number of any type or width
fixed_Name2: Fixed-point number of the same type and width

[6233] Output

[6234] Single bit wide unsigned integer value with 0 as false and 1 as true

[6235] Detailed Description

[6236] Concatenate integer and fraction parts from fixed_Name1 into a single bit string

[6237] Concatenate integer and fraction parts from fixed_Name2 into a single bit string

[6238] True if the first bit string is less than the second

[6239] Return result

[6240] COMP 4.4 FixedLTE(fixed_Name1, fixed_Name2)

[6241] Description

[6242] Returns true if fixed_Name1 is less than or equal to fixed_Name2.

Inputs:
fixed_Name1: Fixed-point number of any type or width
fixed_Name2: Fixed-point number of the same type and width

[6243] Output

[6244] Single bit wide unsigned integer value with 0 as false and 1 as true

[6245] Detailed Description

[6246] Concatenate integer and fraction parts from fixed_Name1 into a single bit string

[6247] Concatenate integer and fraction parts from fixed_Name2 into a single bit string

[6248] True if the first bit string is less than or equal to the second

[6249] Return result

[6250] COMP 4.5 FixedGT(fixed_Name1, fixed_Name2)

[6251] Description

[6252] Returns true if fixed_Name1 is greater than fixed_Name2.

Inputs:
fixed_Name1: Fixed-point number of any type or width
fixed_Name2: Fixed-point number of the same type and width

[6253] Output

[6254] Single bit wide unsigned integer value with 0 as false and 1 as true

[6255] Detailed Description

[6256] Return the result of FixedLT(fixed_Name2, fixed_Name1)

[6257] COMP 4.6 FixedGTE(fixed_Name1, fixed_Name2)

[6258] Description

[6259] Returns true if fixed_Name1 is greater than or equal to fixed_Name2.

Inputs:
fixed_Name1: Fixed-point number of any type or width
fixed_Name2: Fixed-point number of the same type and width

[6260] Output

[6261] Single bit wide unsigned integer value with 0 as false and 1 as true

[6262] Detailed Description

[6263] Return the result of FixedLTE(fixed_Name2, fixed_Name1)
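The relational macros can be modelled in a few lines of Python. The sketch below is illustrative only: the `Fixed` dataclass and the helper names are assumptions, and the type and width check that Handel-C performs at compile time is approximated by a run-time check. It shows how the greater-than forms are obtained by swapping the operands of the less-than forms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fixed:
    """Hypothetical fixed-point value held as raw bit patterns."""
    int_bits: int
    frac_bits: int
    int_width: int
    frac_width: int
    signed: bool


def as_integer(x: Fixed) -> int:
    """Concatenate both parts and interpret the bit string by type."""
    width = x.int_width + x.frac_width
    raw = (x.int_bits << x.frac_width) | x.frac_bits
    if x.signed and raw >> (width - 1):
        raw -= 1 << width            # two's complement interpretation
    return raw


def _check(a: Fixed, b: Fixed) -> None:
    same = (a.int_width, a.frac_width, a.signed) == \
           (b.int_width, b.frac_width, b.signed)
    if not same:
        raise TypeError("operands must have the same type and width")


def fixed_eq(a, b):
    _check(a, b)
    return int(as_integer(a) == as_integer(b))


def fixed_lt(a, b):
    _check(a, b)
    return int(as_integer(a) < as_integer(b))


def fixed_lte(a, b):
    _check(a, b)
    return int(as_integer(a) <= as_integer(b))


def fixed_gt(a, b):
    return fixed_lt(b, a)            # operands swapped


def fixed_gte(a, b):
    return fixed_lte(b, a)           # operands swapped
```

The same bit patterns compare differently depending on type: 1110.1000 is -1.5 when signed, so it is less than 0.5, but 14.5 when unsigned.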

[6264] Bitwise Logical Operators

[6265] The macros in this section rely on Handel-C's type and width checking.

[6266] COMP 5.1 FixedNot(fixed_Name)

[6267] Description

[6268] Returns bitwise not.

Inputs:
fixed_Name: Fixed-point number of any type or width

[6269] Output

[6270] Fixed-point number of same type and width as fixed_Name

[6271] Detailed Description

[6272] Concatenate integer and fraction parts from fixed_Name into a single bit string

[6273] Find the bitwise not of the bit string

[6274] Split result into integer and fraction parts of same type and width as fixed_Name

[6275] Return as struct

[6276] COMP 5.2 FixedAnd(fixed_Name1, fixed_Name2)

[6277] Description

[6278] Returns bitwise and.

Inputs:
fixed_Name1: Fixed-point number of any type or width
fixed_Name2: Fixed-point number of the same type and width

[6279] Output

[6280] Fixed-point number of same type and width as fixed_Name1

[6281] Detailed Description

[6282] Concatenate integer and fraction parts from fixed_Name1 into a single bit string

[6283] Concatenate integer and fraction parts from fixed_Name2 into a single bit string

[6284] Find the bitwise and of the bit strings

[6285] Split result into integer and fraction parts of same type and width as fixed_Name1

[6286] Return as struct

[6287] COMP 5.3 FixedOr(fixed_Name1, fixed_Name2)

[6288] Description

[6289] Returns bitwise or.

Inputs:
fixed_Name1: Fixed-point number of any type or width
fixed_Name2: Fixed-point number of the same type and width

[6290] Output

[6291] Fixed-point number of same type and width as fixed_Name1

[6292] Detailed Description

[6293] Concatenate integer and fraction parts from fixed_Name1 into a single bit string

[6294] Concatenate integer and fraction parts from fixed_Name2 into a single bit string

[6295] Find the bitwise or of the bit strings

[6296] Split result into integer and fraction parts of same type and width as fixed_Name1

[6297] Return as struct

[6298] COMP 5.4 FixedXor(fixed_Name1, fixed_Name2)

[6299] Description

[6300] Returns bitwise xor.

Inputs:
fixed_Name1: Fixed-point number of any type or width
fixed_Name2: Fixed-point number of the same type and width

[6301] Output

[6302] Fixed-point number of same type and width as fixed_Name1

[6303] Detailed Description

[6304] Concatenate integer and fraction parts from fixed_Name1 into a single bit string

[6305] Concatenate integer and fraction parts from fixed_Name2 into a single bit string

[6306] Find the bitwise xor of the bit strings

[6307] Split result into integer and fraction parts of same type and width as fixed_Name1

[6308] Return as struct
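All four bitwise macros follow the same join, operate, split pattern, which the following Python sketch illustrates. The function names and the use of `(int_bits, frac_bits)` tuples are assumptions made for the example; widths are passed explicitly because Python integers carry no width.

```python
import operator


def _join(x, frac_width):
    """Concatenate (int_bits, frac_bits) into a single bit string."""
    int_bits, frac_bits = x
    return (int_bits << frac_width) | frac_bits


def _split(raw, frac_width):
    """Split a bit string back into (int_bits, frac_bits)."""
    return raw >> frac_width, raw & ((1 << frac_width) - 1)


def fixed_not(x, int_width, frac_width):
    mask = (1 << (int_width + frac_width)) - 1
    return _split(~_join(x, frac_width) & mask, frac_width)


def _binary(op, a, b, frac_width):
    return _split(op(_join(a, frac_width), _join(b, frac_width)), frac_width)


def fixed_and(a, b, frac_width):
    return _binary(operator.and_, a, b, frac_width)


def fixed_or(a, b, frac_width):
    return _binary(operator.or_, a, b, frac_width)


def fixed_xor(a, b, frac_width):
    return _binary(operator.xor, a, b, frac_width)
```

For example, with 4 fraction bits the patterns 1100.0011 and 1010.0101 give 1000.0001 under and, 1110.0111 under or, and 0110.0110 under xor.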

[6309] Conversion Operators

[6310] COMP 6.1 FixedIntWidth(fixed_Name)

[6311] Description

[6312] Returns the width of the integer part of fixed_Name as a compile time constant.

Inputs:
fixed_Name: Fixed-point number of any type or width

[6313] Output

[6314] Compile time constant integer

[6315] Detailed Description

[6316] Return the width of the integer part of fixed_Name

[6317] COMP 6.2 FixedFracWidth(fixed_Name)

[6318] Description

[6319] Returns the width of the fraction part of fixed_Name as a compile time constant.

Inputs:
fixed_Name: Fixed-point number of any type or width

[6320] Output

[6321] Compile time constant integer

[6322] Detailed Description

[6323] Return the width of the fraction part of fixed_Name

[6324] COMP 6.3 FixedLiteral(isSigned, intWidth, fracWidth, intBits, fracBits)

[6325] Description

[6326] Returns a signed fixed-point number if isSigned is true and an unsigned fixed-point number if isSigned is false. This number has an integer part intBits of width intWidth and a fraction part fracBits of width fracWidth.

Inputs:
isSigned: Compile time constant to indicate the type of fixed-point structure. FIXED_ISSIGNED represents signed and FIXED_ISUNSIGNED unsigned
intWidth: Compile time constant integer to set width of integer part
fracWidth: Compile time constant integer to set width of fraction part
intBits: Value to set integer part
fracBits: Value to set fraction part

[6327] Output

[6328] Signed or unsigned fixed-point number with widths and values specified

[6329] Detailed Description

[6330] Selects signed or unsigned type to cast structure using isSigned

[6331] Return a fixed-point number with an integer part of width intWidth and value intBits, and a fraction part of width fracWidth and value fracBits

[6332] COMP 6.4 FixedToInt(fixed_Name)

[6333] Description

[6334] Returns the integer part of the fixed-point number with the same type and width.

Inputs:
fixed_Name: Fixed-point number of any type or width

[6335] Output

[6336] Integer of the same type and width as the integer part stored in the fixed-point structure

[6337] Detailed Description

[6338] Return the integer part of the fixed point number

[6339] COMP 6.5 FixedToBool(fixed_Name)

[6340] Description

[6341] Returns a single bit wide unsigned int value which is 0 for false if the operand equals 0 and 1 for true otherwise.

Inputs:
fixed_Name: Fixed-point number of any type or width

[6342] Output

[6343] Single bit wide unsigned integer value with 0 as false and 1 as true

[6344] Detailed Description

[6345] Return 1 if the integer and fraction parts of fixed_Name are not both equal to zero, and 0 otherwise

[6346] COMP 6.6 FixedToBits(fixed_Name)

[6347] Description

[6348] Returns the integer and fraction bits of fixed_Name concatenated together.

Inputs:
fixed_Name: Fixed-point number of any type or width

[6349] Output

[6350] Integer of same type as the fixed-point structure and with width intWidth+fracWidth

[6351] Detailed Description

[6352] Return the integer part and the fraction part of the fixed-point number concatenated together
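As a rough illustration of how FixedLiteral, FixedToInt, FixedToBool and FixedToBits relate, the following Python sketch models a fixed-point structure as a dictionary. The dictionary layout, the function names and the boolean values given to FIXED_ISSIGNED and FIXED_ISUNSIGNED are assumptions for the example; the bit pattern returned by `fixed_to_bits` is left uninterpreted as to sign.

```python
# Assumed values; the text only says these are compile time constants.
FIXED_ISSIGNED = True
FIXED_ISUNSIGNED = False


def fixed_literal(is_signed, int_width, frac_width, int_bits, frac_bits):
    """Build a fixed-point value with the given widths and bit values."""
    return {
        "signed": bool(is_signed),
        "int_width": int_width,
        "frac_width": frac_width,
        "int_bits": int_bits & ((1 << int_width) - 1),
        "frac_bits": frac_bits & ((1 << frac_width) - 1),
    }


def fixed_to_int(x):
    """Return the integer part only."""
    return x["int_bits"]


def fixed_to_bits(x):
    """Return integer and fraction bits concatenated together."""
    return (x["int_bits"] << x["frac_width"]) | x["frac_bits"]


def fixed_to_bool(x):
    """Return 0 only when both parts are zero, 1 otherwise."""
    return int(x["int_bits"] != 0 or x["frac_bits"] != 0)
```

A literal with integer bits 0010 and fraction bits 1000 (2.5 in a 4.4 format) concatenates to 0010 1000, and a value with a zero integer part but non-zero fraction still converts to true.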

[6353] COMP 6.7 FixedCastSigned(isSigned, intWidth, fracWidth, fixed_Name)

[6354] Description

[6355] Casts any signed fixed-point number to the type and widths specified.

Inputs:
isSigned: Compile time constant to indicate the type of fixed-point structure. FIXED_ISSIGNED represents signed and FIXED_ISUNSIGNED unsigned
intWidth: Width to cast the integer part of the number to
fracWidth: Width to cast the fraction part of the number to
fixed_Name: Fixed-point number of signed type and any width

[6356] Output

[6357] Fixed-point number of the type specified

[6358] Detailed Description

[6359] Adjust the integer part of fixed_Name to a width of intWidth by either taking the intWidth least significant bits or sign extending.

[6360] Adjust the fraction part of fixed_Name to a width of fracWidth by either taking the fracWidth most significant bits or adding bits with value zero after the number.

[6361] If isSigned is true then cast the integer and fraction parts of the fixed-point number as signed

[6362] If isSigned is false then cast the integer and fraction parts of the fixed-point number as unsigned

[6363] Return the result as a struct

[6364] COMP 6.8 FixedCastUnsigned(isSigned, intWidth, fracWidth, fixed_Name)

[6365] Description

[6366] Casts any unsigned fixed-point number to the type and widths specified.

Inputs:
isSigned: Compile time constant to indicate the type of fixed-point structure. FIXED_ISSIGNED represents signed and FIXED_ISUNSIGNED unsigned
intWidth: Width to cast the integer part of the number to
fracWidth: Width to cast the fraction part of the number to
fixed_Name: Fixed-point number of unsigned type and any width

[6367] Output

[6368] Fixed-point number of the type specified

[6369] Detailed Description

[6370] Adjust the integer part of fixed_Name to a width of intWidth by either taking the intWidth least significant bits or adding bits with value zero in front of the number.

[6371] Adjust the fraction part of fixed_Name to a width of fracWidth by either taking the fracWidth most significant bits or adding bits with value zero after the number.

[6372] If isSigned is true then cast the integer and fraction parts of the fixed-point number as signed

[6373] If isSigned is false then cast the integer and fraction parts of the fixed-point number as unsigned

[6374] Return the result as a struct
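The width adjustments used by the two cast macros can be sketched as below. This is a simplified Python model, not the macro code: `resize_int`, `resize_frac` and `fixed_cast` are hypothetical names, the source type is passed as a flag instead of being selected by macro name, and the result is returned as a plain tuple.

```python
def resize_int(bits, old_width, new_width, sign_extend):
    """Keep the new_width least significant bits, or extend at the front."""
    if new_width <= old_width:
        return bits & ((1 << new_width) - 1)
    if sign_extend and old_width > 0 and bits >> (old_width - 1):
        # Fill the added high bits with ones to preserve the sign.
        bits |= ((1 << (new_width - old_width)) - 1) << old_width
    return bits                      # otherwise the added bits are zero


def resize_frac(bits, old_width, new_width):
    """Keep the new_width most significant bits, or append zero bits."""
    if new_width <= old_width:
        return bits >> (old_width - new_width)
    return bits << (new_width - old_width)


def fixed_cast(is_signed, int_width, frac_width,
               src_int, src_frac, src_int_width, src_frac_width, src_signed):
    """Return (is_signed, int_bits, frac_bits) after the cast."""
    new_int = resize_int(src_int, src_int_width, int_width, src_signed)
    new_frac = resize_frac(src_frac, src_frac_width, frac_width)
    return bool(is_signed), new_int, new_frac
```

Casting the signed 4.4 value -1.5 (1110.1000) to an 8.2 format sign extends the integer part to 1111 1110 and keeps the top two fraction bits, 10, which is still -1.5.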

[6375] Verification

[6376] This section documents all of the tests necessary to verify that each macro functions correctly. It is important that the macros match their definitions in the subsections above. All of the macros available to the user can be tested for results and errors using black box testing.

[6377] Runtime tests:

[6378] The types of test performed are:

[6379] Positive (P)

[6380] Negative (N)

[6381] Volume and Stress (V&S)

[6382] Comparison (C)

[6383] Demonstration (D)

[6384] The tests are performed only on 8 and 32 bit numbers apart from when it seems appropriate to use other widths also to fully test the macro, such as for FixedIntWidth. The tests are performed on signed and unsigned numbers apart from when this is not possible because the macro is only designed for one type. Generally the tests are aimed at:

[6385] Zero values (P, V&S, C, D)

[6386] Midrange values (P, V&S, C, D)

[6387] Overflow values (N, V&S, C, D)

[6388] The results expected for the comparison tests have been calculated using the Microsoft Calculator.

[6389] Error Tests:

[6390] These are tests which may produce non-severe errors from the compiler. They should be either standard Handel-C error messages directed at the functions used or assert errors defined in the library. Generally the tests performed may be:

[6391] inputting an integer where there should be a fixed-point structure

[6392] inputting a signed fixed-point structure where there should be an unsigned fixed-point structure and vice versa

[6393] inputting two fixed-point structures of different types or width in the same macro

[6394] inputting an integer of incorrect width

[6395] inputting variables when a constant is required

[6396] assigning fixed-point result to a fixed-point structure of incorrect type or width

[6397] assigning integer result to an int of incorrect type or width

[6398] Performance Tests:

[6399] All of the macros take just one clock cycle to run. In the case of the arithmetic operators, the number of SLICEs and maximum speed may be calculated to compare with the appropriate Handel-C operator.

Waveform Analysis

[6400] Trace/Pattern Window

[6401] FIG. 93 illustrates a Trace and Pattern window 9300. In the Trace and Pattern window, the top half 9302 of the window shows the trace or pattern details. The bottom half 9304 of the window shows the values and positions of marks that have been set on the trace or pattern. The marks are referred to as cursors and represented by colored triangles.

[6402] In an illustrative embodiment, the current trace is outlined with a green dashed line. The current cursor has a red underline. Right-clicking the trace waveform or the current value pane calls up a menu of possible display formats for that pane. Multiple traces or patterns can be shown in a single window, but they all use the same cursors, the same number of points and the same clock period.

[6403] Zooming

[6404] A user may zoom in and out of the active Trace or Pattern windowusing the zoom icons or the Zoom options from the View menu.

[6405] Set Advance Step Dialog

[6406] The Set Advance Step dialog (Capture>Set Advance Step) specifies the time in nanoseconds to advance all simulations by.

[6407] Capture Menu

[6408] Several items of a capture menu according to an embodiment of the present invention include the following set forth in Table 23.

TABLE 23
Run (F5): Start reading traces from simulations and sending patterns to simulations.
Pause: Temporarily stop reading traces from simulations and sending patterns to simulations. This may also suspend all connected simulations.
Stop (Shift+F5): Stop reading traces from simulations and sending patterns to simulations. Simulations may continue running after Waveform Analyzer has stopped.
Advance (Ctrl+F11): Advance all simulations by the specified interval.
Set Advance Step: Specify the interval by which to advance simulations when ‘Advance’ is selected. Opens Set Advance Step dialog.

[6409] Define Symbols Dialog Box

[6410] The Define symbols dialog box consists of a set of radio buttons which allow selection of how values are represented:

[6411] Binary

[6412] Octal

[6413] Decimal

[6414] Hexadecimal numbers

[6415] ASCII characters

[6416] User defined strings: The user may supply the filename of a file which associates symbols with values for the trace being defined. Each line of this file should contain a number (in binary, octal, decimal or hexadecimal, using the Handel-C syntax) followed by a symbol. The symbol should be separated from the number using a whitespace. Any values which may appear in the Trace and which do not have symbols associated with them may be represented using the ‘?’ character. For example, if the trace is of width 3, is unsigned, and the user defined symbol file contains the following:

0b001 A
0b111 D
0b110 C
0b101 B

[6417] the values 1, 5, 6 and 7 may be represented as A, B, C and D respectively. The values 0, 2, 3 and 4 may all be represented as question marks.
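The lookup described above can be sketched in Python as follows. The function names are hypothetical, and the number prefixes handled here (`0b` for binary, `0x` for hexadecimal, a leading `0` for octal) are an assumption about the Handel-C literal syntax rather than something stated in this text.

```python
def parse_number(text):
    """Parse a literal; the prefix rules are an assumed Handel-C syntax."""
    t = text.lower()
    if t.startswith("0b"):
        return int(t[2:], 2)
    if t.startswith("0x"):
        return int(t[2:], 16)
    if len(t) > 1 and t.startswith("0"):
        return int(t[1:], 8)
    return int(t, 10)


def load_symbols(lines):
    """Build a value-to-symbol table from 'number symbol' lines."""
    table = {}
    for line in lines:
        parts = line.split(None, 1)   # split on the first run of whitespace
        if len(parts) == 2:
            table[parse_number(parts[0])] = parts[1].strip()
    return table


def symbolise(values, table):
    """Map each value to its symbol, or '?' when none is defined."""
    return [table.get(v, "?") for v in values]


symbols = load_symbols(["0b001 A", "0b111 D", "0b110 C", "0b101 B"])
print(symbolise(range(8), symbols))
```

For the 3-bit example file, the values 1, 5, 6 and 7 map to A, B, C and D, and every other value maps to a question mark.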

[6418] Edit Menu

[6419] Items in the Edit Menu include those set forth in Table 24.

TABLE 24
Find (Ctrl+F): Search for a specified sequence of data words in the selected trace or pattern. The user is prompted for a PGL statement describing the sequence of words to search for, the search direction, and whether to scroll to the sequence if it is found. Searching starts at the position of the selected cursor. If there is no cursor, searching starts at the beginning of the selected trace or pattern. If the sequence is found, the selected cursor is positioned at the start of the sequence. If there is no cursor, a cursor is created at the start of the sequence.
Copy (Ctrl+C): Copy the selected portion of the selected trace or pattern to the clipboard.
Paste (Ctrl+V): Paste the contents of the clipboard into the selected portion of the selected pattern.
Save Selection As...: Save the selected portion of the selected trace or pattern to a file. The user is prompted for a filename and, if the file is a VCD file, a reference name to use for the signal in the VCD file.

[6420] File Menu

[6421] Items in the File Menu include those set forth in Table 25.

TABLE 25
New (Ctrl+N): Open a new trace or pattern dialog. The user is prompted for the type of window, a filename for the window and the clock period and number of points for the window. The clock period and the number of points that the user specifies may be used for all traces or patterns in the window.
Open (Ctrl+O): Open an existing trace or pattern file.
Close: Close the active trace or pattern window.
Save (Ctrl+S): Save the active trace or pattern window.
Save As: Save the active trace or pattern window with a different name.
Save All: Save all open trace and pattern windows.
New Project: Create a new project.
Open Project: Open an existing project.
Close Project: Close the current project.
Save Project: Save the current project.
Print: Print the active trace or pattern window.
Print Setup: Set up the printer details.
Print Preview: Preview the active trace or pattern window.
Recent Files: A list of recently used trace or pattern files.
Recent Projects: A list of recently used projects.
Exit: Close all windows and exit the application.

[6422] New Window Dialog

[6423] The New window dialog box (File>New) defines the default clock period and the number of points in the window. Elements include those set forth in Table 26.

TABLE 26
Untitled box: Enter the window name
Default clock period: Enter the default clock period in nanoseconds
Default No. points: Enter the number of points recorded in the window
Filename and location: File where the window details are stored (use the browse button to choose a directory)

[6424] Pattern menu

[6425] Items in the Pattern Menu include those set forth in Table 27.

TABLE 27
New Pattern: Create a new pattern in the active pattern window.
Edit Pattern: Edit the selected pattern in the active pattern window.
Delete Pattern: Remove the selected pattern from the active pattern window.

[6426] Pattern Properties Dialog Box

[6427] The fields in the Pattern properties dialog box include those set forth in Table 28.

TABLE 28
Name: Name to use for the pattern. The name is displayed in a box on the left of the Pattern window. The name may be a C-style identifier.
Width: Width of the data in the pattern in bits.
Type: Whether the pattern represents signed or unsigned data.
Points: Number of points in the pattern. This value was entered when the Pattern window was created. It cannot be edited.
Clock Period: Rate at which data is read into the pattern. This value was entered when the Pattern window was created. It cannot be edited.
Source: The source for the pattern may be either a file or a script. Supported file formats are ASCII and VCD. The box to the right of the radio buttons is used to enter a script if the ‘Script’ radio button is checked, or a file name if the ‘File’ radio button is checked.
Variable: If the source is a VCD file, this box should be used to enter the reference name of the variable in the VCD file that may be used as the source for this pattern.
Destination: Expression of the form ‘Terminal-Name(width)’ as for the DK1 Connect plugin.
Trigger: Transmission of a pattern can be triggered by the occurrence of a specified sequence of words in any trace. This box is used to specify which sequences of words and which trace triggering should occur on. If this box is empty, no trigger is used and all further trigger options are grayed out.
Delay: Specifies the trigger delay. For patterns, this may be positive. A delay of x means that transmission begins x time units after a trigger sequence occurs.
No trigger/Single/Auto: Choose the trigger mode. No trigger, triggering is disabled. Single, a pattern is transmitted once after a trigger sequence occurs. Auto, a pattern is transmitted after every occurrence of a trigger sequence.
Pause on Trigger: If this checkbox is ticked, capturing may automatically get paused after a trigger sequence has occurred and a pattern has been transmitted.
Interpolated Waveform/Stepped Waveform/Numeric Symbolic: This set of radio buttons is used to choose the display format for the pattern.
Define Symbols (gives dialog): Select how values are represented.

[6428] Grouping Windows into Projects

[6429] Trace Windows and Pattern Windows can be grouped together into projects. Only one project may be open at a time. The user may create a project if he or she wants to use a Pattern Generation Language Script file.

[6430] Creating a Project

[6431] Open Waveform Analyzer and select New Project from the File menu.

[6432] A dialog box appears asking the user to select a file name for the new project. Project filenames have an ‘.APJ’ extension.

[6433] Script Menu

[6434] The script menu includes the following item:

[6435] Edit Script...: Edit the PGL script for the current project.

[6436] Trace Dialog

[6437] Fields in the Trace properties dialog box (Trace>New Trace) include the following set forth in Table 29.

TABLE 29
Field: Function
Name: Name to use for the trace. This name may be displayed in a box on the left of the Trace window. The name can also be used as part of a trigger specification for this or any other trace. The name may be a C-style identifier.
Width: Width of the data in the trace in bits.
Type: Whether the trace represents signed or unsigned data.
Points: Number of points in the trace. This value was entered when the Trace window was created. It cannot be edited.
Clock Period: Rate at which data is read into the trace. This value was entered when the Trace window was created. It cannot be edited.
Expression: Port(s) the trace is connected to. The expression may be of the form ‘Terminal-Name(width)’ (as for the DK1 Connect plugin) or a Handel-C expression with expressions of the form ‘Terminal-Name(width)’ in place of variables.
Dump File: Enter a filename to capture the trace to a file. Two file formats are supported: ASCII files and Verilog Value Change Dump files. If the filename ends in ‘.VCD’, ‘.DMP’ or ‘.DUMP’ a Value Change Dump file may be produced, otherwise an ASCII file may be produced. The Browse button may be used to select a filename. If no filename is entered, no dump file may be produced.
Variable: If the dump file is a Verilog Value Change Dump file, enter the name which may be used as the reference name of the signal in the VCD file.
Trigger: Specifies which sequences of words triggering should occur on. If this box is empty, no trigger is used and all further trigger options are grayed out.
Delay: Specifies the trigger delay. This may be positive or negative. If a positive delay x is used, capturing begins x time units after a trigger sequence occurs. If a negative delay is used, capturing begins x time units before a trigger sequence occurs.
No trigger/Single/Auto: Select trigger mode. No trigger, triggering is disabled. Single, a trace is captured once after a trigger sequence occurs. Auto, a trace is captured after every occurrence of a trigger sequence.
Pause on Trigger: If this checkbox is ticked, capturing may automatically get paused after a trigger sequence has occurred and a trace has been captured.
Interpolated Waveform/Stepped Waveform/Numeric Symbolic: Select display format for trace.
Define Symbols (gives dialog): Select how values are represented.
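The Dump File rule in Table 29 amounts to a choice based on the filename extension, sketched here in Python. The function name is hypothetical, and treating the extension as case-insensitive is an assumption; the table only lists the upper-case forms.

```python
import os

# Extensions that select a Value Change Dump file, per Table 29.
VCD_EXTENSIONS = {".VCD", ".DMP", ".DUMP"}


def dump_format(filename):
    """Return 'VCD', 'ASCII', or None when no dump file is requested."""
    if not filename:
        return None                          # empty field: no dump file
    extension = os.path.splitext(filename)[1].upper()
    return "VCD" if extension in VCD_EXTENSIONS else "ASCII"
```

Under these assumptions `dump_format("run1.vcd")` selects a VCD file, `dump_format("run1.txt")` selects an ASCII file, and an empty filename produces no dump file.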

[6438] Trace Menu

[6439] Items in the Trace Menu include those set forth in Table 30.

TABLE 30
New Trace: Create a new trace in the active Trace window.
Edit Trace: Edit the selected trace in the active Trace window.
Delete Trace: Remove the selected trace from the active Trace window.

[6440] View Menu

[6441] Items available from the View Menu include those set forth in Table 31.

TABLE 31
Toolbar: Toggle the toolbar on/off.
Status Bar: Toggle the Status Bar on/off.
Zoom Max: Zoom in to the maximum extent at the centre of the active trace or pattern window.
Zoom In: Zoom in at the centre of the active trace or pattern window.
Zoom Out: Zoom out from the centre of the active trace or pattern window.
Zoom Min: Zoom out to the maximum extent from the centre of the active trace or pattern window.
Zoom on Cursor: Zoom in on the selected cursor in the active trace or pattern window.
Jump to Cursor: Scroll to the selected cursor in the active trace or pattern window.
New Cursor: Create a new cursor in the centre of the active trace or pattern window.
Delete Cursor: Delete the selected cursor from the active trace or pattern window.

[6442] Toolbar Icons

[6443] FIG. 94 illustrates several toolbar icons 9400 and their functions 9402.

[6444] Window Menu

[6445] Items in the Window Menu include those set forth in Table 32.

TABLE 32
Cascade: Cascade all open windows.
Tile: Tile all open windows.
Arrange Icons: Automatically arrange all minimized trace and pattern windows.

[6446] Analyzer Interface

[6447] The waveform analyzer interface consists of:

[6448] menu bar

[6449] tool bar

[6450] workspace area

[6451] any trace or pattern windows open

[6452] log output window: used by the program to report errors to the user.

[6453] The user may group trace or pattern windows together in a project. Projects contain a number of trace or pattern windows (those open during the last save of the project) and any scripts written in the project.

[6454] Menus

[6455] File Menu

[6456] New windows dialog (File>New)

[6457] Edit Menu

[6458] View Menu

[6459] Trace Menu

[6460] Trace dialog

[6461] Pattern Menu

[6462] Pattern properties dialog

[6463] Define symbols dialog

[6464] Script Menu

[6465] Capture Menu

[6466] Window Menu

[6467] Help Menu

[6468] Pattern Generation Language

[6469] The Waveform Analyzer uses Pattern Generation Language (PGL) as a scripting language to generate patterns. PGL has a similar expressive power to regular expressions, but uses a C-like syntax.

[6470] The PGL can be used to trigger on a sequence of data and search for a sequence of data in a trace or pattern.

[6471] When executed, a PGL program generates a sequence of values. When used for triggering or searching, a program in PGL may match any sequence which it could generate. PGL topics include:

[6472] PGL statements

[6473] PGL functions

[6474] Wild-card matching

[6475] Pattern Generation Language syntax

[6476] Using the Waveform Analyzer

[6477] The Waveform Analyzer connects to ports in Handel-C simulations. It displays outputs from Handel-C simulations as waveforms (traces). Similarly, a user can generate inputs to Handel-C simulations and display them as waveforms (patterns). The user can also manipulate the simulated inputs and outputs in the same way that input and output signals from a real piece of hardware can be manipulated with a waveform analyzer. A partial list of manners in which the waveform analyzer can be used follows.

[6478] Connecting traces to output ports in Handel-C simulations

[6479] Connecting patterns to input ports in Handel-C simulations

[6480] Connecting the Waveform Analyzer to ports connected to another simulation using the DK1Share plugin (connecting in parallel)

[6481] Measuring the differences between values and times in traces or patterns using cursor marks.

[6482] Creating patterns by writing scripts using a Pattern Generation Language, or by copying existing traces or patterns into a pattern window. Patterns can also be read from a file.

[6483] Specifying triggers in the Pattern Generation Language

[6484] Capturing traces or generating patterns when a specified trigger appears in a trace

[6485] Finding a specified pattern in a trace or pattern window

[6486] Functions in PGL

[6487] PGL allows a user to define and call functions which can take parameters. Only functions in an open project can be defined. The functions are stored in the script.pgl file associated with that project. The user may edit this file outside the Waveform Analyzer.

[6488] Defining functions

[6489] To define functions, the project where the functions are to be used is opened. The ‘Edit Script’ icon on the toolbar is selected. Alternatively, Edit Script from the Script menu can be selected: the file script.pgl is opened in Notepad. This file resides in the same directory as the project file.

[6490] Example

[6491] The following example defines two functions, one called rising_edge and the other called rectangular_wave.

rising_edge()
{0;1;}

rectangular_wave(hival,hicount,loval,locount,cycles)
{
    loop (cycles)
    {
        loop(hicount) hival;
        loop(locount) loval;
    }
}

[6492] The rectangular_wave function can be called with a statement like the following:

[6493] rectangular_wave(1,5,0,5,10);
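To show what sequence such a call generates, the two PGL functions can be mirrored as Python generators. This is an illustrative translation only; it is not how the Waveform Analyzer executes PGL.

```python
def rising_edge():
    """Mirror of the PGL rising_edge function: generates 0 then 1."""
    yield 0
    yield 1


def rectangular_wave(hival, hicount, loval, locount, cycles):
    """Mirror of the PGL rectangular_wave function."""
    for _ in range(cycles):
        for _ in range(hicount):
            yield hival              # high level held for hicount points
        for _ in range(locount):
            yield loval              # low level held for locount points


wave = list(rectangular_wave(1, 5, 0, 5, 10))
print(len(wave), wave[:12])
```

The call above produces ten cycles of five ones followed by five zeros, 100 values in total.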

[6494] Wild-Card Matching in Triggering or Searching

[6495] When a PGL program is used for triggering or searching, it may contain a ‘?’ character in any place where a number or variable could go. This character stands for ‘any value’. For example the compound statement {1;?;1;} would match against any 3 word sequence starting and ending with a 1.

[6496] If ‘?’ is used as an actual parameter in a function call, when the function is called, the formal parameter which corresponds to the ‘?’ has no value assigned to it.

[6497] If a variable is encountered which has no value assigned to it, it gets assigned a value according to the values encountered during matching.

[6498] Context-sensitive matches can be carried out in this way. For example, if a function is defined as follows:

[6499] count_fives(a)

[6500] {a; loop(a) 5;}

[6501] and called using the statement ‘count_fives(?);’, it may match any sequence consisting of a number, followed by that number of fives (including the sequence ‘0’). This feature should be used carefully, since it is possible to use it to write functions which take a very long time to match.
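The two matching examples can be hand-translated into Python to show the binding behaviour. These functions are written specifically for the two PGL fragments discussed here and are not a general PGL matcher; the names and the return conventions are assumptions.

```python
def match_one_any_one(seq, start=0):
    """Match {1;?;1;}: three words, the first and last equal to 1."""
    window = seq[start:start + 3]
    return len(window) == 3 and window[0] == 1 and window[2] == 1


def match_count_fives(seq, start=0):
    """Match count_fives(?): a number a followed by a fives.

    Returns the length of the match, or None when there is no match.
    """
    if start >= len(seq):
        return None
    a = seq[start]                   # '?' leaves a unbound; bind it here
    if a < 0:
        return None
    body = seq[start + 1:start + 1 + a]
    if len(body) == a and all(value == 5 for value in body):
        return 1 + a
    return None
```

The sequence 3, 5, 5, 5 matches with a bound to 3, and the single-word sequence 0 matches with an empty run of fives.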

[6502] PGL Statements

[6503] The pattern generation language consists of one or more statements terminated by semi-colons. Statements can include numbers, identifiers and wild-cards. A PGL statement may be one of the following:

[6504] Expression Statement

[6505] For example:

[6506] 1;

[6507] When an expression statement is executed, it generates the value of the expression.

[6508] An expression statement used for matching may also be of the form:

[6509] !1;

[6510] This statement may match any value except 1.

[6511] Compound Statement

[6512] For example: {0;1;}

[6513] The statements enclosed in the curly brackets get executed sequentially.

[6514] Loop Statement

[6515] For example: loop(3) {0;1;}

[6516] The body of this loop may get executed 3 times.

[6517] Conditional Statement

[6518] For example:

[6519] if(a==1) {0;1;} else {1;0;}

[6520] Here, the statements which get executed depend upon the value of the variable a.

[6521] A user can build Boolean tests for the condition in a conditional statement using the following operators:

[6522] == != ! || &&

[6523] Switch Statement

[6524] For example:

switch(a) {
  case 1: 0; 1; break;
  default: 1; 0; break;
}

[6525] This switch statement achieves the same thing as the if-else statement described above.

[6526] Assert Statement

[6527] For example:

[6528] assert(a !=0);

[6529] This kind of statement can be used when matching to place constraints on matched variables.

[6530] Wild-Card Matching in PGL

[6531] If a PGL program is used for triggering or searching, a ‘?’ character can be used in any place where a number or variable could go. This character stands for ‘any value’. For example, the compound statement {1;?;1;} would match against any 3-word sequence starting and ending with a 1.

[6532] If ‘?’ is used as an actual parameter in a function call, when the function is called, the formal parameter which corresponds to the ‘?’ has no value assigned to it. If a variable is encountered which has no value assigned to it, it gets assigned a value according to the values encountered during matching. Context-sensitive matches can be carried out in this way. For example, consider a function defined as follows:

[6533] count_fives(a)

[6534] {a; loop(a) 5;}

[6535] If this is called using the statement ‘count_fives(?);’, it may match any sequence consisting of a number, followed by that number of fives (including the sequence ‘0’). This feature should be used carefully, since it is possible to use it to write functions which take a very long time to match.
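The context-sensitive match described for count_fives(?) can be sketched in Python as follows. The function name match_count_fives and its return convention are hypothetical and chosen only for illustration; the sketch assumes that the unbound parameter a is bound to the first word encountered and that the loop body must then match exactly a fives.

```python
from typing import Optional, Sequence


def match_count_fives(words: Sequence[int], start: int = 0) -> Optional[int]:
    """Try to match the PGL call count_fives(?) at position `start`.

    Returns the index just past the matched words, or None if there is
    no match. The parameter `a` has no value on entry, so it is bound
    to the first word seen; that value then fixes how many fives follow.
    """
    if start >= len(words):
        return None
    a = words[start]              # '?' leaves a unbound; bind it here
    end = start + 1 + a
    if a < 0 or end > len(words):
        return None               # not enough words left for loop(a) 5;
    if all(w == 5 for w in words[start + 1:end]):
        return end
    return None


print(match_count_fives([0]))               # matches the sequence '0'
print(match_count_fives([3, 5, 5, 5, 9]))   # matches 3 followed by three fives
print(match_count_fives([2, 5, 4]))         # no match
```

Because the binding of a depends on the data, a matcher that has to try every starting position and every possible binding can become slow, which is consistent with the caution given above.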

[6536] It is an error to use the ‘?’ expression, expression statements starting with ‘!’ and assert statements in PGL programs which are used to generate patterns.

[6537] Connecting in Parallel

[6538] If it is desired to connect the Waveform Analyzer to ports that are connected to another simulation, this may be done using the DK1Share.dll.

[6539] Example:

interface bus_out() seg7_output(unsigned 7 output1 = encode_out)
  with {extlib="DK1Share.dll",
    extinst=" \
Share={extlib=<7segment.dll>, extinst=<A>, extfunc=<PlugInSet>} \
Share={extlib=<DK1Connect.dll>, extinst=<SS(7)>, extfunc=<DK1ConnectGetSet>} \
",
    extfunc="DK1ShareGetSet"
  };

[6540] This example uses DK1Share.dll to share the output port seg7_output.output1 between the 7-segment display (connected to terminal A) and DK1Connect (connected to terminal SS(7)). A user can then trace the output going to the 7-segment display by using SS(7) as the expression in the Trace properties window.

[6541] Finding a Sequence of Data in a Trace or Pattern

[6542] To find a sequence of data, the window that contains the trace or pattern to be searched is activated. If there are multiple traces or patterns in the window, the desired trace or pattern is selected. The Edit>Find menu item is selected. A PGL statement or function is entered in the ‘Find what:’ box in the Find dialog.

[6543] Generating Patterns

[6544] Generating a Pattern from an Existing Trace or Pattern

[6545] Data is copied from a trace or pattern into the clipboard, and then the contents of the clipboard are pasted into a pattern. A region of a trace or pattern is selected, such as by dragging the mouse pointer over the region to select it. The region is copied to the clipboard by selecting Copy from the Edit menu or with the Copy icon on the toolbar. A pattern window is activated and either a region to paste over or a cursor is selected. Paste is selected from the Edit menu or the Paste icon on the toolbar is clicked on.

[6546] If a region has been selected, the clipboard contents are pasted into the selected pattern starting at the beginning of the selected region. If a cursor was selected, the clipboard contents are pasted into the window starting at the selected cursor location.

[6547] Generating a Pattern from a PGL Statement:

[6548] Script is selected as the pattern source in the Pattern Properties dialog. The PGL statement or function call is entered in the box to the right of the button.

[6549] Generating a Pattern from a File:

[6550] File is selected as the pattern source in the Pattern Properties dialog. The filename is entered in the box to the right of the button. The Browse button is used to browse for a file.

[6551] Pattern Generation Limitations

[6552] It is an error to use the ‘?’ expression, expression statements starting with ‘!’ and assert statements in PGL programs which are used to generate patterns.

[6553] Complex Pattern-Generation

[6554] More complex patterns may require using a separate Handel-C program to perform pattern generation.

[6555] Measuring Time and Value Differences in Windows

[6556] The user can measure the time between two events and the difference in the value of a signal at two different times by placing marks in Trace and Pattern Windows. These marks are represented by colored triangles and may be referred to as cursors. One cursor is always selected.

[6557] The cursor triangles are placed in the time pane of the trace or pattern window. If there is more than one cursor in the time pane, the time pane displays the differences in time between cursors. The bottom-centre pane displays the absolute position in time of all cursors. The differences in values between the cursors are displayed in the bottom-left pane.

[6558] If multiple traces or patterns are displayed in a window, the values given are those of the selected trace or pattern.

[6559] Creating Cursors

[6560] Click on the New cursor icon on the toolbar or select New Cursor from the View menu. The cursor may be added to the center of the time pane of the active trace or pattern window.

[6561] Moving Cursors

[6562] Drag the cursor across the time pane.

[6563] Selecting Cursors

[6564] Double-click a cursor. A red bar may appear beneath it to show that it is selected. By default, the first cursor created is the selected cursor. Only one cursor can be selected at a time.

[6565] Deleting Cursors

[6566] Select the cursor to be deleted. Click the Delete Cursor icon on the toolbar or select Delete Cursor from the View menu.

[6567] Connecting a Pattern to a Port

[6568] To connect a pattern to a port, the following steps are performed:

[6569] 1. write and compile Handel-C code to connect a Handel-C port to a terminal using the DK1Connect and the DK1Sync plugins.

[6570] 2. set up a pattern window in the analyzer generating a signal to the named terminal.

[6571] 3. simulate the Handel-C code and start transmitting the pattern.

[6572] Writing the Handel-C Program

[6573] To write a program in Handel-C, open Handel-C, create a new project and enter the following program:

set clock = external "P1"
  with {extlib = "DK1Sync.dll", extinst = "50", extfunc = "DK1SyncGetSet"};

interface bus_in(unsigned 1 in) ib1()
  with {extlib = "DK1Connect.dll", extinst = "t(1)", extfunc = "DK1ConnectGetSet"};

unsigned 5 count = 0;

void main(void)
{
  while (!count[4] || !count[2])
  {
    if (ib1.in == 0)
    {
      delay;
      if (ib1.in == 1)
        count++;
    }
    else
      delay;
  }
}

[6574] Note: this program uses the DK1Connect plugin to connect the port ib1.in to the terminal t(1). The program may only terminate when it has detected 20 rising edges from the port ib1.in, since 20 (binary 10100) is the first value of count in which bits 4 and 2 are both set.
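The termination behaviour of this program can be checked with a small Python model. This is a simplified sketch and not a Handel-C simulator: it assumes that each delay and each count++ consumes one input sample, that evaluating an if condition consumes none, and that the 5-bit counter wraps at 32. The function name count_rising_edges is hypothetical.

```python
from typing import List, Tuple


def count_rising_edges(samples: List[int]) -> Tuple[int, int]:
    """Model of the Handel-C main loop reading port ib1.in.

    Returns (count, samples_consumed). The loop exits once bits 4 and 2
    of count are both set, or when the input samples run out.
    """
    count = 0
    i = 0
    while not (((count >> 4) & 1) and ((count >> 2) & 1)):
        if i >= len(samples):
            break                         # input exhausted before termination
        if samples[i] == 0:
            i += 1                        # delay: one clock cycle
            if i < len(samples) and samples[i] == 1:
                count = (count + 1) & 0x1F    # unsigned 5 count++
                i += 1                    # the assignment takes one cycle
        else:
            i += 1                        # delay: one clock cycle
    return count, i


# The pattern loop(20) {0;1;} supplies 40 samples containing 20 rising edges.
print(count_rising_edges([0, 1] * 20))    # (20, 40)
```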

[6575] Set Up a Pattern Window

[6576] To set up a pattern window, the following general steps are performed:

[6577] 1. Open Waveform Analyzer.

[6578] 2. In Waveform Analyzer, select New from the File menu and create a new pattern with a filename, with 40 as the number of points and 50 as the clock period.

[6579] 3. An empty Pattern window appears. Select New Pattern from the Pattern menu or from the toolbar and enter the following properties in the dialog box. Note Table 33.

TABLE 33
Name: testpattern
Width: 1
Type: Unsigned
Source: Select Script radio button. Enter loop(20) {0;1;} in the box.
Variable: Grayed out.
Destination: t(1)
Trigger: Leave box blank. Other settings should be grayed out.
Delay: Grayed out with 0 as default
Display: Check Stepped Waveform radio button

[6580] 4. Click OK.

[6581] Start Transmission

[6582] To start the transmission, run the Handel-C simulation. Start transmission by clicking the Run icon on the toolbar, or by selecting Run from the Capture menu. The Handel-C program should terminate shortly after transmission is started. To stop capturing, click on the stop icon on the toolbar or select Stop from the Capture menu.

[6583] Starting the Waveform Analyzer

[6584] To start the Waveform Analyzer:

[6585] Select Start>Programs>DK1 Design Suite>Waveform Analyzer or,

[6586] Double-click the icon for the analyzer.exe file in the DK1\Bin directory.

[6587] Connecting a Simulation to a Trace

[6588] To connect a simulation to a trace:

[6589] 1. write and compile Handel-C code to connect a Handel-C port to a terminal using the DK1Connect and the DK1Sync plugins.

[6590] 2. set up a trace window in the analyzer which reads the signal from the named terminal.

[6591] 3. simulate the Handel-C code and start capturing.

Sample Handel-C program

set clock = external "P1"
  with {extlib = "DK1Sync.dll", extinst = "50", extfunc = "DK1SyncGetSet"};

unsigned 3 x = 0;

interface bus_out() ob1(unsigned 3 out = x)
  with {extlib = "DK1Connect.dll", extinst = "t(3)", extfunc = "DK1ConnectGetSet"};

void main(void)
{
  while(1)
    x++;
}

[6592] Note: this program uses the DK1Connect plugin to connect the port ob1.out to the terminal t(3). Compile the program but do not run it.

[6593] Set Up a Trace Window

[6594] To set up a trace window, open Waveform Analyzer. In Waveform Analyzer, select New from the File menu and create a new trace. Select the browse button to specify a filename and location. Set Default Clock Period to 50 and Default No. points to 40.

[6595] An empty Trace window should appear. Select New Trace from the Trace menu or from the toolbar and enter the following properties in the dialog box. Note Table 34.

TABLE 34
Name: testtrace
Width: 3
Type: Unsigned
Expression: t(3)
Dump File: Leave blank
Variable: Grayed out
Trigger: Leave box blank. Other settings should be grayed out.
Delay: Grayed out with 0 as default
Display: Check the Stepped Waveform radio button.

Click OK.

[6596] Start Capturing

[6597] Start capturing by clicking the Run icon on the toolbar, or by selecting Run from the Capture menu. A red dashed line should appear (jumping around all over the place). This line marks the current position in the trace. Run the Handel-C simulation. To stop capturing, click on the stop button on the toolbar or select Stop from the Capture menu. The Handel-C simulation is stopped.

[6598] Using the Pattern Generation Language

[6599] The Pattern Generation Language (PGL) can be used to:

[6600] Generate patterns that are fed into a port

[6601] Identify a sequence of data in a trace to use as a trigger. The trigger can be used to start recording the trace or to start generating a pattern. If a trigger associated with a trace or pattern has been defined, it may be re-used as a trigger for other traces or patterns.

[6602] Find a sequence of data in a trace or a pattern

[6603] Entering PGL Statements

[6604] PGL is entered as a single PGL statement in the properties dialog for a trace or pattern. The PGL statement may be a compound statement or a function call. PGL functions may be written in the script.pgl file associated with a project.

[6605] Complex Pattern-Matching and Pattern-Generation

[6606] A separate Handel-C program can be written to perform pattern generation or pattern matching. For triggering, a trigger signal can be output from this Handel-C program to Waveform Analyzer, and then a simple PGL statement can be used to trigger on this signal.

[6607] Using Triggers

[6608] A sequence of data to be used as a trigger can be specified. Alternatively, an existing specification can be used.

[6609] When the trigger sequence occurs, the following are enabled:

[6610] Start capturing a trace before, at or after the specified trigger

[6611] Start generating a pattern at or after the specified trigger

[6612] Stop capturing a trace or generating a pattern.

[6613] To Specify a Trigger:

[6614] Open the Pattern or Trace Properties dialog. Enter ‘trace name:’ in the Trigger box followed by a PGL statement, where trace name is the name of a pre-defined trace (note that it may be followed by a colon). The PGL statement may be matched against the named trace. For example:

[6615] b : {0;1;}

[6616] would cause the pattern to be generated on a rising edge of trace b. The appropriate radio button is selected. Radio buttons include those set forth in Table 35.

TABLE 35
No trigger: No triggering
Single: Transmit or capture the first time the trigger is received
Auto: Transmit or capture each time the trigger is received.
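The difference between the Single and Auto settings of Table 35 can be sketched in Python. The function find_triggers, the mode strings, and the use of None to stand for the ‘?’ wild-card are all assumptions made for illustration; the sketch also assumes that overlapping matches each count as a trigger in Auto mode, which the description does not specify.

```python
from typing import List, Optional, Sequence


def find_triggers(trace: Sequence[int],
                  pattern: Sequence[Optional[int]],
                  mode: str = "single") -> List[int]:
    """Return the sample indices at which `pattern` completes in `trace`.

    mode "none":   no triggering, always returns an empty list.
    mode "single": only the first time the trigger is received.
    mode "auto":   every time the trigger is received.
    A None entry in `pattern` matches any value (the '?' wild-card).
    """
    if mode == "none":
        return []
    hits: List[int] = []
    n = len(pattern)
    for end in range(n, len(trace) + 1):
        window = trace[end - n:end]
        if all(p is None or p == w for p, w in zip(pattern, window)):
            hits.append(end - 1)
            if mode == "single":
                break
    return hits


trace_b = [0, 0, 1, 1, 0, 1]
# Equivalent of the trigger "b : {0;1;}" (a rising edge on trace b).
print(find_triggers(trace_b, [0, 1], "single"))   # [2]
print(find_triggers(trace_b, [0, 1], "auto"))     # [2, 5]
```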

[6617] To re-use a specified trigger, “name is entered in the Trigger box, where name is the name of the trace or pattern that uses a trigger. Note that name may be preceded by a double-quote. For example:

[6618] “trace1

[6619] would cause the trace or pattern whose details are being entered to use the same trigger as the trace named trace1. The appropriate radio button is selected. Note Table 36.

TABLE 36
No trigger: No triggering
Single: Transmit or capture the first time the trigger is received
Auto: Transmit or capture each time the trigger is received.

[6620] To Specify the Delay Between the Trigger and the Action

[6621] Specify a trigger and enter the number of time units in the Delay box on the Properties dialog. The delay is in the time units for that window. Delays can be positive or negative for a trace (negative delays capture before the trigger, positive after) and positive or zero for a pattern.
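The delay rule can be summarised by a short Python sketch. The function action_time and the strings "trace" and "pattern" are hypothetical; the sketch simply assumes that the action starts at the trigger time plus the delay, in the time units of the window.

```python
def action_time(trigger_time: int, delay: int, kind: str) -> int:
    """Return the time at which capture or generation starts.

    kind "trace":   delay may be negative, zero or positive.
    kind "pattern": delay must be zero or positive.
    """
    if kind not in ("trace", "pattern"):
        raise ValueError("kind must be 'trace' or 'pattern'")
    if kind == "pattern" and delay < 0:
        raise ValueError("a pattern cannot start before its trigger")
    return trigger_time + delay


print(action_time(500, -100, "trace"))    # capture starts 100 units before the trigger
print(action_time(500, 150, "pattern"))   # generation starts 150 units after the trigger
```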

[6622] To Pause on Trigger

[6623] Specify a trigger and check the Pause box.

[6624] File Formats

[6625] A preferred embodiment of the Waveform Analyzer supports two different file formats for storing waveform data. These can include, for example, ASCII files, where data elements are written in ASCII and separated by whitespace; and Value Change Dump (VCD) files. The VCD file format is specified in the IEEE 1364 standard.

[6626] A VCD file can contain any number of variables. If several traces are dumped to the same VCD file, simply enter the same VCD filename in the ‘Dump File’ box for every trace which should be written to that file. The ‘Variable’ box in the Trace dialog is used to enter a reference name which may be used in the VCD file for the signal.

[6627] When reading a pattern from a VCD file, the ‘Variable’ box in the Pattern dialog is used to enter the reference name of the variable in the VCD file which needs to be read. The file extension of a Dump File or Pattern source file determines the file format. If the extension is ‘.VCD’, ‘.DMP’ or ‘.DUMP’, the file is a Value Change Dump file; otherwise it is an ASCII file.
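The extension rule and the ASCII layout can be sketched in Python as follows. The names waveform_file_format and parse_ascii_waveform are hypothetical. Two assumptions go beyond the description: the extension comparison is treated as case-insensitive, and the data elements of an ASCII file are taken to be decimal integers.

```python
import os
from typing import List

# Extensions that select the Value Change Dump format.
VCD_EXTENSIONS = {".vcd", ".dmp", ".dump"}


def waveform_file_format(filename: str) -> str:
    """Return "vcd" or "ascii" according to the file extension."""
    ext = os.path.splitext(filename)[1].lower()   # assumed case-insensitive
    return "vcd" if ext in VCD_EXTENSIONS else "ascii"


def parse_ascii_waveform(text: str) -> List[int]:
    """Parse whitespace-separated data elements (assumed decimal)."""
    return [int(token) for token in text.split()]


print(waveform_file_format("trace1.VCD"))     # vcd
print(waveform_file_format("trace1.txt"))     # ascii
print(parse_ascii_waveform("0 1 1\n0\t7"))    # [0, 1, 1, 0, 7]
```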

[6628] Pattern Generation Language Syntax

[6629] The following are syntax statements used during programming:

subprogram_def ::= identifier ( [parameter-list] ) compound-statement

parameter-list ::= identifier
  | identifier , parameter-list

statements ::= statement
  | statement statements

statement ::= subprogram_call
  | compound-statement
  | loop-statement
  | if-statement
  | if-else-statement
  | switch-statement
  | break-statement
  | expression-statement
  | assert-statement

subprogram_call ::= identifier ( [expression-parameter-list] ) ;

expression-parameter-list ::= expression
  | expression , expression-parameter-list

compound-statement ::= { statements }

loop-statement ::= loop ( expression ) statement
  | loop forever statement

if-statement ::= if ( boolean-expression ) statement

if-else-statement ::= if ( boolean-expression ) statement else statement

switch-statement ::= switch ( expression ) { case-list [default : statements] }

case-list ::= case
  | case case-list

case ::= case number : statements

break-statement ::= break ;

expression-statement ::= expression ;
  | ! expression ;

assert-statement ::= assert ( boolean-expression ) ;

boolean-expression ::= expression == expression
  | expression != expression
  | expression
  | boolean-expression && boolean-expression
  | boolean-expression || boolean-expression
  | ! boolean-expression
  | ( boolean-expression )

(here, && has higher precedence than || and both are left associative)

expression ::= ?
  | number
  | identifier

[6630] Numbers may be binary, octal, decimal or hexadecimal integers, and use the same syntax as Handel-C (i.e. 0b... for binary numbers, 0... for octal numbers, 0x... for hex numbers; all other numbers are treated as decimals).
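The number prefixes can be illustrated with a minimal Python sketch. The function parse_pgl_number is hypothetical and is not the parser used by the Waveform Analyzer; it assumes that the prefixes are accepted in either letter case and that a lone 0 denotes zero.

```python
def parse_pgl_number(text: str) -> int:
    """Convert a PGL number literal to an integer.

    0b... is binary, 0x... is hexadecimal, a leading 0 followed by
    further digits is octal, and anything else is decimal.
    """
    t = text.strip().lower()          # prefix case-insensitivity is assumed
    if t.startswith("0x"):
        return int(t[2:], 16)
    if t.startswith("0b"):
        return int(t[2:], 2)
    if len(t) > 1 and t.startswith("0"):
        return int(t[1:], 8)
    return int(t, 10)


for literal in ("0b101", "017", "0x1F", "42", "0"):
    print(literal, parse_pgl_number(literal))
```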

[6631] Identifiers are C-style identifiers.

[6632] Time Units

[6633] Time units are not explicitly defined in Waveform Analyzer. Any Handel-C simulation to which the Waveform Analyzer is connected should use the DK1Sync plugin with the clock period for the simulation specified in an extinst string. When the clock period is entered for a trace or pattern window, the sample rate for the traces or patterns in the window is determined relative to the clock period specified for the Handel-C simulation. If the clock period for the trace or pattern is the same as the clock period specified in the extinst string in the Handel-C program, the trace or pattern may be sampled on every cycle of the Handel-C program. If the clock period for the trace or pattern is twice the clock period specified in the extinst string in the Handel-C program, the trace or pattern may be sampled on every other cycle of the Handel-C program, and so on. It is a matter of convenience to make the clock periods correspond to the clock periods that may be used in the target hardware. Preferably, the VCD file reader/writer used by Waveform Analyzer assumes that the time units used are nanoseconds.
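The relationship between the two clock periods can be sketched in Python. The function sampled_cycles is hypothetical; it assumes that the window clock period is a whole multiple of the simulation clock period and that sampling starts at cycle 0, neither of which is stated in the description.

```python
from typing import List


def sampled_cycles(sim_period: int, window_period: int,
                   n_cycles: int) -> List[int]:
    """Return the simulation cycle numbers on which a trace or pattern
    with clock period `window_period` is sampled, given a simulation
    clock period `sim_period` (the value in the DK1Sync extinst string).
    """
    if sim_period <= 0 or window_period % sim_period != 0:
        raise ValueError("window period assumed to be a multiple of the simulation period")
    step = window_period // sim_period
    return list(range(0, n_cycles, step))


print(sampled_cycles(50, 50, 4))     # same period: every cycle
print(sampled_cycles(50, 100, 6))    # twice the period: every other cycle
```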

[6634] While various embodiments have been described above, it should be understood that they have been presented by way of example only, and not limitation. Thus, the breadth and scope of a preferred embodiment should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.

What is claimed is:
 1. A method for equipping a simulator with plug-ins, comprising the steps of: (a) executing a first simulator for generating a first model, wherein the first simulator is written in a first programming language; (b) executing a second simulator for generating a second model, wherein the second simulator is written in a second programming language, and the first simulator interfaces with the second simulator via a plug-in; and (c) co-simulating utilizing the first model and the second model.
 2. A method as recited in claim 1, wherein an accuracy and speed of the co-simulation is user-specified.
 3. A method as recited in claim 1, wherein the first simulator is cycle-based and the second simulator is event-based.
 4. A method as recited in claim 1, wherein the co-simulation includes interleaved scheduling.
 5. A method as recited in claim 1, wherein the co-simulation includes fully propagated scheduling.
 6. A method as recited in claim 1, wherein the simulations are executed utilizing a plurality of processors.
 7. A method as recited in claim 1, wherein the first simulator may be executed ahead of or behind the second simulator.
 8. A method as recited in claim 1, wherein the first simulator is coupled to the second simulator via a network.
 9. A computer program product for equipping a simulator with plug-ins, comprising: (a) computer code for executing a first simulator for generating a first model, wherein the first simulator is written in a first programming language; (b) computer code for executing a second simulator for generating a second model, wherein the second simulator is written in a second programming language, and the first simulator interfaces with the second simulator via a plug-in; and (c) computer code for co-simulating utilizing the first model and the second model.
 10. A computer program product as recited in claim 9, wherein an accuracy and speed of the co-simulation is user-specified.
 11. A computer program product as recited in claim 9, wherein the first simulator is cycle-based and the second simulator is event-based.
 12. A computer program product as recited in claim 9, wherein the co-simulation includes interleaved scheduling.
 13. A computer program product as recited in claim 9, wherein the co-simulation includes fully propagated scheduling.
 14. A computer program product as recited in claim 9, wherein the simulations are executed utilizing a plurality of processors.
 15. A computer program product as recited in claim 9, wherein the first simulator may be executed ahead of or behind the second simulator.
 16. A computer program product as recited in claim 9, wherein the first simulator is coupled to the second simulator via a network.
 17. A system for equipping a simulator with plug-ins, comprising: (a) logic for executing a first simulator for generating a first model, wherein the first simulator is written in a first programming language; (b) logic for executing a second simulator for generating a second model, wherein the second simulator is written in a second programming language, and the first simulator interfaces with the second simulator via a plug-in; and (c) logic for co-simulating utilizing the first model and the second model.
 18. A system as recited in claim 17, wherein an accuracy and speed of the co-simulation is user-specified.
 19. A system as recited in claim 17, wherein the first simulator is cycle-based and the second simulator is event-based.
 20. A system as recited in claim 17, wherein the co-simulation includes interleaved scheduling.